API to Autonomy: Architecting LLM Workloads at Hyperscale

08 May 2025, 13:05 - 14:05

About the Event

“Why rely on black-box APIs when you can build and scale your own LLM stack with better control over cost, reliability, and performance?”

LLM APIs are great—until cost spikes, latency issues, and lack of control slow you down. In this podcast-style session, we’ll dive into how organizations can take back control by deploying and scaling their own massive models (think 400B+ parameters). From provisioning strategies and HPC-powered infrastructure to workflow design and cost optimization, we’ll break down what it really takes to run LLMs your way. Whether you're tired of vendor limitations or ready to build a custom LLM stack that scales with you, this session delivers the insights to make it happen.


Key Takeaways:

  • LLM APIs offer ease, but self-hosting brings control, customization, and cost-efficiency.
  • Learn how to deploy and scale massive models (400B+ parameters) using high-performance computing (HPC).
  • Explore real-world strategies for infrastructure planning, reliability, and scalable LLM deployment.
  • Build an LLM stack tailored to your needs—free from third-party limitations and vendor lock-in.

Along with Bhawna, we have Rishabh Bhardwaj, a Research Associate at Technische Hochschule Deggendorf, specializing in high-performance and quantum computing. With a background in platform engineering and over three years of hands-on experience, he has led the deployment of large-scale systems, including Spark clusters, vector databases, and machine learning models. His current work focuses on pushing the boundaries of infrastructure to support scalable and efficient deployment of large language models. You can reach him on LinkedIn.

About the Speaker

Bhawna Rupani

Senior Gen AI Engineer at Cygeniq

Bhawna Rupani is a Senior Gen AI Engineer at Cygeniq, a US-based cybersecurity and AI start-up, where she builds agentic AI frameworks. She previously worked with a leading European travel-tech company and top ed-tech companies in India. She is also associated with the Government of India's PMO initiatives that help college students transition into the AI/ML industry. You can reach her on LinkedIn.

Registration Details

2381 registered
