Abhishek Divekar

Senior Applied Scientist

Abhishek Divekar is a Senior Applied Scientist in Amazon's International Machine Learning team. His work has driven over half a billion dollars in revenue growth for Amazon and led to the deployment of 1,000+ ML models worldwide. He has authored multiple papers at Tier-1 AI conferences, pioneering fundamental research in areas including Synthetic Dataset Generation, Retrieval-Augmented Generation, and LLM-as-a-Judge, while also leading major open-source scientific projects. Abhishek earned his MS in Computer Science from The University of Texas at Austin and holds a B.Tech. from VJTI, Mumbai.

Synthetic data is transforming the landscape of training foundation models such as GPTs and Stable Diffusion by enabling the creation of diverse, privacy-conscious, and annotation-efficient datasets. In this illuminating session, we will trace the frontier of synthetic data generation. We'll discuss generative AI techniques that are reshaping industries, demonstrating how synthetic datasets created by LLMs, diffusion models, and hybrids of the two can augment or even replace traditional human-curated data. We'll highlight the pitfalls of careless generation at scale, including the amplification of hallucinations and entrenched biases, and offer practical strategies for safeguarding data quality. You'll learn how to ground synthetic data in real-world contexts, using distributional similarity metrics and LLM-as-a-Judge to reliably benchmark synthetic data against human data. Join us to discover how responsible synthetic data practices can drive a more robust, ethical, and innovative AI-powered future.

Managing and scaling ML workloads has never been a bigger challenge. Data scientists are looking to collaborate on, build, train, and iterate on thousands of AI experiments. On the flip side, ML engineers are looking for distributed training, artifact management, and automated deployment for high performance.

Hack Session: The Promise and Pitfalls of Synthetic Data Generation
