John Gilhuly

Head of Developer Relations

Arize AI

John is the Head of Developer Relations at Arize AI, focused on open-source LLM observability and evaluation tooling. He holds an MBA from Stanford, where he specialized in the ethical, social, and business implications of AI development, and a B.S. in C.S. from Duke. Prior to joining Arize, John led GTM activities at Slingshot AI, and served as a venture fellow at Omega Venture Partners. In his pre-AI life, John built out and ran technical go-to-market teams at Branch Metrics.

A full-day, hands-on workshop designed to equip participants with the end-to-end knowledge and practical skills to build, evaluate, monitor, and improve Agentic AI systems. Through seven comprehensive modules, participants will explore LLM fundamentals, prompt optimization, observability, evaluation, and system productionization with real-world use cases. You will spend the entire day focusing on the following key areas:

  • Understand the architecture, components, and unique challenges of LLM agents versus traditional applications.
  • Learn how to implement observability and tracing to monitor and debug complex LLM systems effectively.
  • Explore robust evaluation techniques, including human feedback, LLM-based scoring (sketched after this list), and agent-specific metrics.
  • Master advanced prompt engineering, tool use strategies, and optimization techniques for building capable agents.
  • Apply concepts in a real-world capstone project with hands-on experience in deploying, monitoring, and improving agentic AI systems.
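
To make the evaluation topic concrete ahead of the workshop, here is a minimal LLM-as-judge sketch. It assumes the `openai` Python package and a configured API key; the model name, judge prompt, and `judge_answer` helper are illustrative placeholders rather than the workshop's actual evaluation stack.

```python
# Minimal LLM-as-judge sketch (illustrative only; the workshop's tooling may differ).
from openai import OpenAI  # assumes the openai package is installed and OPENAI_API_KEY is set

client = OpenAI()

JUDGE_PROMPT = """You are grading an AI agent's answer.
Question: {question}
Agent answer: {answer}
Reply with exactly one word: correct or incorrect."""

def judge_answer(question: str, answer: str, model: str = "gpt-4o-mini") -> str:
    """Ask a judge model to label an agent response; returns the raw label."""
    response = client.chat.completions.create(
        model=model,  # hypothetical model choice for this sketch
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(question=question, answer=answer)}],
        temperature=0,  # deterministic grading
    )
    return response.choices[0].message.content.strip().lower()

# Score a tiny benchmark set and report accuracy.
examples = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "What is the capital of France?", "answer": "Berlin"},
]
labels = [judge_answer(e["question"], e["answer"]) for e in examples]
print(f"Judge-scored accuracy: {labels.count('correct') / len(labels):.0%}")
```

In practice, judge prompts like this are usually validated against human-labeled examples before being trusted at scale, which is one of the workflows the workshop covers.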

In this workshop, participants will work with a range of tools for building and optimizing LLM applications and agents, including:

  • Tracing and observability platforms for prompt tracking, token usage, and latency monitoring (a minimal sketch follows this list)
  • Evaluation frameworks using A/B testing, LLM-as-judge methods, and benchmark datasets
  • Prompt engineering utilities for few-shot learning and chain-of-thought techniques
  • Vector databases and RAG tools for context retrieval
  • Caching and model optimization tools for performance tuning
  • Agent architecture components such as memory and planning modules
  • Production monitoring dashboards for KPIs and degradation tracking
  • A/B testing infrastructure for continuous improvement
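
As a taste of the tracing topic, the sketch below hand-rolls latency and token-usage logging around an LLM call using only the Python standard library. The names `traced`, `TRACE_LOG`, and the placeholder `call_llm` are hypothetical; dedicated observability platforms capture the same signals, plus full traces, automatically.

```python
# Minimal tracing sketch: a decorator that records latency and token usage per call.
import time
from functools import wraps

TRACE_LOG: list[dict] = []  # in-memory stand-in for a tracing backend

def traced(fn):
    """Wrap an LLM-calling function and log its latency and reported token count."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        latency_s = time.perf_counter() - start
        TRACE_LOG.append({
            "span": fn.__name__,
            "latency_s": round(latency_s, 3),
            # Assumes the wrapped function returns a dict with a token count;
            # real SDK responses expose usage metadata in their own formats.
            "total_tokens": result.get("total_tokens"),
        })
        return result
    return wrapper

@traced
def call_llm(prompt: str) -> dict:
    """Placeholder LLM call so the sketch runs without external services."""
    time.sleep(0.05)  # simulate network latency
    return {"text": f"echo: {prompt}", "total_tokens": len(prompt.split())}

call_llm("Summarize the agent's last three tool calls.")
print(TRACE_LOG)
```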

Prerequisites:

  • A solid understanding of Python and GenAI applications
  • Familiarity with Google Colab or local Python development environments

*Note: These details are tentative and subject to change.

As AI agents become more capable and integrated into real-world workflows, evaluating and improving their performance is essential. This hands-on workshop equips participants with the tools and strategies needed to build smarter, safer, and more effective AI agents. From tracing and prompt optimization to evaluation frameworks and production monitoring, attendees will gain practical skills to assess, optimize, and scale agentic systems.
