Dr. Aditya Bhattacharya

Applied AI/ML Lead

About

Aditya is the Applied AI Lead at Nutanix, where he works on advancing applied AI and machine learning solutions for Nutanix Panacea AI. He brings over 10 years of experience in data science, machine learning, IoT, and software development, along with several years of leadership and team management experience. Aditya holds a PhD in Explainable AI from KU Leuven and is the author of the well-known book Applied Machine Learning Explainability Techniques. He is also an active contributor to the AI community, having spoken at conferences such as AAAI, ACM, ODSC, Indo Data Week, and GIDS, and regularly shares his learnings through platforms like Towards Data Science, Medium, and YouTube.

As agentic AI systems evolve from simple prompt-response pipelines into complex, multi-step, tool-using architectures, traditional evaluation approaches fall short. End-to-end benchmarks alone cannot explain why an agent fails, while component-level metrics often miss emergent behaviours across the system. This talk introduces a multi-layered evaluation and observability framework designed to make agentic systems measurable, debuggable, and production-ready.
 
We begin with end-to-end evaluation strategies, including the design of high-quality golden datasets tailored to agent workflows, which enable reliable measurement of real-world task success. We then zoom in on component-level evaluation, breaking agent pipelines down into planning, tool selection, memory usage, and reasoning to identify failure modes with precision.
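
As a rough illustration of how a golden dataset can feed both levels of evaluation, the sketch below scores a toy agent on end-to-end task success and on one component-level signal, tool selection. Everything in it is an assumption made for illustration: the `GoldenCase` and `AgentRun` types, the `evaluate` function, the exact-match scoring rule, and the toy agent are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoldenCase:
    """One golden-dataset entry (hypothetical schema)."""
    task: str
    expected_answer: str
    expected_tools: list[str]


@dataclass
class AgentRun:
    """What the agent under test returns for one task (hypothetical schema)."""
    answer: str
    tools_called: list[str]


def evaluate(agent: Callable[[str], AgentRun], cases: list[GoldenCase]) -> dict:
    """Score an agent end to end (answer match) and per component (tool choice)."""
    if not cases:
        raise ValueError("golden dataset is empty")
    successes = 0
    tool_matches = 0
    failures = []
    for case in cases:
        run = agent(case.task)
        # End-to-end check: assumed here to be a case-insensitive exact match.
        answer_ok = run.answer.strip().lower() == case.expected_answer.strip().lower()
        # Component-level check: did the agent call the expected tools, in order?
        tools_ok = run.tools_called == case.expected_tools
        successes += int(answer_ok)
        tool_matches += int(tools_ok)
        if not answer_ok:
            # Keep enough context to tell a tool-selection failure from a reasoning one.
            failures.append({"task": case.task, "tools_ok": tools_ok, "got": run.answer})
    n = len(cases)
    return {
        "task_success_rate": successes / n,
        "tool_selection_accuracy": tool_matches / n,
        "failures": failures,
    }


def toy_agent(task: str) -> AgentRun:
    """Stand-in agent that only knows how to answer weather questions."""
    if "weather" in task:
        return AgentRun(answer="sunny", tools_called=["weather_api"])
    return AgentRun(answer="unknown", tools_called=[])


GOLDEN = [
    GoldenCase("weather in Paris", "Sunny", ["weather_api"]),
    GoldenCase("2 + 2", "4", ["calculator"]),
]

if __name__ == "__main__":
    print(evaluate(toy_agent, GOLDEN))
```

Recording `tools_ok` next to each failed task is the point of the design: the end-to-end score says that a task failed, while the component-level flag hints at where in the pipeline to look.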
 
The session further explores observability patterns for modern AI systems, including tracing, structured logging, and instrumentation using platforms like LangFuse, as well as codeless observability via reverse proxy gateways for MCP-based services.
 
By connecting evaluation with observability, this session provides a practical blueprint for moving from opaque, brittle agents to transparent, reliable, and continuously improving AI systems. Attendees will leave with concrete techniques, architectural patterns, and mental models to evaluate and operate agentic systems at scale.