Building Real-World AI Systems with Small Language Models

  • Aug 08, 2026
  • 9:30 AM – 5:30 PM

About the Workshop

This session provides a comprehensive introduction to Small Language Models (SLMs), covering what they are, why they matter, and how they fit within modern Generative AI systems alongside Large Language Models (LLMs). It will explore key trade-offs across size, cost, latency, accuracy, and sustainability, along with the core architectural principles behind lightweight transformer-based models.

The session will also cover essential techniques such as knowledge distillation, parameter-efficient fine-tuning, and model optimization approaches including quantization and pruning.
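To make two of those optimization techniques concrete, here is a minimal, self-contained sketch of symmetric int8 post-training quantization and magnitude pruning applied to a toy weight list. The helper names and numbers are illustrative, not part of any specific SLM toolkit; production systems would apply these ideas per-tensor or per-channel across full model weights.

```python
def quantize_int8(weights):
    """Symmetric post-training quantization: map floats to int8
    using a single per-tensor scale derived from the max magnitude."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

def prune_magnitude(weights, fraction):
    """Magnitude pruning: zero out the smallest-magnitude weights."""
    k = int(len(weights) * fraction)
    threshold = sorted(abs(w) for w in weights)[k] if k else 0.0
    return [0.0 if abs(w) < threshold else w for w in weights]

# Illustrative weights only.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Per-element rounding error is bounded by half the scale.
max_err = max(abs(a - b) for a, b in zip(weights, restored))

pruned = prune_magnitude(weights, 0.4)  # zeros the 2 smallest weights
```

Both techniques shrink the memory and compute footprint of a model: quantization stores each weight in 8 bits instead of 32, while pruning introduces sparsity that hardware or runtimes can exploit.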

Building on these foundations, the workshop will transition into real-world implementation. Through progressive, hands-on exercises, participants will design, build, evaluate, and iteratively improve a multi-agent system, starting with baseline SLMs and then enhancing performance with fine-tuned models and memory-augmented approaches.

In addition, the session will highlight practical scenarios where SLMs are most effective, including edge and on-device AI, high-volume low-cost workloads, real-time systems, domain-specific applications, privacy-sensitive use cases, and multi-agent systems for cost-efficient, production-grade AI deployments.