From Idea to Production with GenAI: Realizing the Art of the Possible

About

In this session, I will share practical insights from actual production deployments of GenAI applications across multiple industries. Drawing from my experience with AWS, I will demonstrate:

1. Building a Customer Service Solution for Production - You will learn how we:

  • Implemented multilingual support through fine-tuned speech models
  • Achieved high throughput through strategic infrastructure scaling
  • Cut latency using NVIDIA TensorRT-LLM compilation
  • Deployed custom containers with Triton Inference Server

2. Automated Cricket Scene Analysis with Vision Language Models - I will show how we:

  • Reduced 45-50 minutes of manual processing per game to automated analysis
  • Built models to identify replays, bowler run-ups, and scorecard data
  • Developed resolution-agnostic, font-adaptive models for varying broadcast qualities
  • Optimized performance through LoRA fine-tuning on SageMaker HyperPod

3. GenAI-Based Data Analyst - I will demonstrate how we:

  • Transform natural language queries into SQL and visualization code
  • Build conversational analytics assistants for complex data exploration
  • Set up secure database connections with proper authentication
  • Generate optimized data visualizations through LLM-powered code
  • Enable business users to perform sophisticated analytics without SQL knowledge
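
As a rough illustration of the query-to-SQL flow above, the following minimal sketch shows one way to guard LLM-generated SQL before it reaches a database. It is not the implementation from the session: the helper `run_generated_sql`, the `sales` table, and the hard-coded `generated_sql` string (standing in for a model's output) are all hypothetical, and an in-memory SQLite database replaces a real authenticated connection.

```python
import sqlite3

def run_generated_sql(conn, sql):
    """Run model-generated SQL only if it is a single SELECT statement.

    Hypothetical guard: deliberately conservative, so it also rejects
    valid queries that contain a semicolon or start with WITH.
    """
    statement = sql.strip().rstrip(";")
    if ";" in statement or not statement.lower().startswith("select"):
        raise ValueError("only single SELECT statements are allowed")
    return conn.execute(statement).fetchall()

# In-memory database standing in for a real, authenticated connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# Assumed model output for the question "What are total sales per region?"
generated_sql = (
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region;"
)
rows = run_generated_sql(conn, generated_sql)
```

A string check like this is only a first line of defense; in practice it would typically sit alongside a database role that has read-only permissions.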

Key Takeaways:

  • Learn practical architectures to deploy GenAI in production environments
  • Master strategies to balance performance and cost through model selection and optimization
  • Implement techniques to improve reliability and reduce hallucinations
  • Apply methods to optimize latency, throughput, and scalability
  • Adopt cost optimization approaches including model compression and caching strategies
