Engineering Efficient LLM Inference: From Model Optimization to Scalable Systems
28 Apr 2025, 1:04 PM - 2:04 PM
About the Event
Large Language Models (LLMs) have set new benchmarks in AI—but turning their power into real-world products is no small feat. The true bottleneck? Inference. Running LLMs at scale demands fleets of GPUs, deep pockets, and serious engineering chops. In this session, we’ll go under the hood of how leading AI teams are slashing inference costs and boosting performance with smart model tweaks, system-level magic, and infrastructure hacks. Whether you're building AI products or scaling existing ones, this talk will equip you with practical insights to deploy LLMs efficiently—without burning through your cloud budget.
Key Takeaways:
- LLM inference is the hidden bottleneck in scaling AI applications efficiently and affordably.
- Deploying LLMs at scale requires system-level innovations and model optimizations to reduce cost and latency.
- Top AI companies are leveraging engineering strategies to make LLMs leaner, faster, and production-ready.
- The talk demystifies real-world deployment challenges and offers insights into building sustainable, scalable AI.
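To make the "model optimizations" takeaway above concrete, here is a minimal, illustrative sketch (not taken from the talk) of one common cost-cutting technique: loading a causal LM with 4-bit weight quantization using Hugging Face Transformers and bitsandbytes. The model ID is a placeholder; any causal LM you have access to works the same way.

```python
# Illustrative only: 4-bit weight quantization with Hugging Face Transformers
# + bitsandbytes, one common model-level optimization for cheaper inference.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder; swap in any causal LM

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights as 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bf16 for quality
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",                      # place layers across available GPUs
)

prompt = "Explain why KV caching speeds up LLM inference, in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Quantizing weights this way roughly quarters the memory footprint of the model relative to fp16, which is one of the levers the session covers for fitting LLM serving into a smaller, cheaper GPU fleet.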
Who is this DataHour for?
- Engineers and practitioners building AI products or scaling existing ones
- Anyone looking for practical insights into deploying LLMs efficiently and affordably
About the Speaker
Rishit is currently a Machine Learning Engineer at Cohere and holds a Master’s in Computer Science from NYU Courant with a focus on ML and NLP. Passionate about building innovative AI products, he brings experience across retail, banking, and content creation. From developing intelligent systems for Fortune 500 clients to optimizing trading strategies with meta-ML platforms, his work blends research and real-world impact. He is driven by the elegance of the mathematical algorithms that power meaningful AI solutions. You can reach him on LinkedIn.
Become a Speaker
Share your vision, inspire change, and leave a mark on the industry. We're calling for innovators and thought leaders to speak at our event.
- Professional Exposure
- Networking Opportunities
- Thought Leadership
- Knowledge Exchange
- Leading-Edge Insights
- Community Contribution
