Hardik Meisheri

Senior Applied Scientist

About

Hardik Meisheri is a Senior Applied Scientist at Microsoft AI with over a decade of experience in reinforcement learning and machine learning. His work spans the full arc of RL's evolution: from classical control problems such as supply chain optimization and multi-agent coordination at TCS Research, to fine-tuning LLMs with RLHF for ad policy understanding at Amazon Advertising, to his current work building large-scale foundational models at Microsoft AI. Across these roles, he has tackled many of the core challenges that underpin today's LLM-based agents: designing effective reward signals, improving sample efficiency in complex environments, scaling multi-agent systems under real-world constraints, and aligning model behavior with production-level objectives.