Measuring Uncertainty in LLMs and Optimal Use of SLMs

About

Large Language Models (LLMs) are redefining NLP with their remarkable reasoning capabilities, but they still hallucinate, making up facts that can derail decision-critical tasks like clinical trial matching or medical entity extraction. In this session, we’ll explore how understanding and quantifying uncertainty can help tackle this reliability gap.

We’ll demystify uncertainty vs. confidence, break down aleatoric vs. epistemic uncertainty, and walk through estimation techniques for white-box (e.g., LLaMA), grey-box (e.g., GPT-3), and black-box (e.g., GPT-4) models. Expect hands-on demonstrations using open-source LLMs and tools, with a reality check on why SoftMax scores alone can be misleading.
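
To make the white-box case concrete, here is a minimal sketch (illustrative, not the talk's exact recipe) that reads per-token softmax probabilities from an open-weights causal LM and turns them into simple confidence signals: mean token probability, length-normalized log-likelihood, and predictive entropy. The model name ("gpt2") is an assumption chosen so the snippet runs anywhere; the point is that a high softmax score is only a raw signal, not a calibrated probability of being correct.

```python
# Minimal sketch: token-level confidence for a white-box model via softmax
# probabilities and predictive entropy. "gpt2" is an illustrative choice;
# any open-weights causal LM with accessible logits works the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "The capital of Australia is Canberra."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# Logits at position t predict the token at position t + 1.
probs = torch.softmax(logits[0, :-1], dim=-1)
targets = inputs["input_ids"][0, 1:]

token_probs = probs[torch.arange(targets.size(0)), targets]
avg_log_likelihood = token_probs.log().mean()  # length-normalized log-likelihood
# Predictive entropy per step (clamp avoids log(0) from underflowed probs).
predictive_entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean()

print(f"mean token prob:        {token_probs.mean().item():.3f}")
print(f"avg log-likelihood:     {avg_log_likelihood.item():.3f}")
print(f"mean predictive entropy:{predictive_entropy.item():.3f}")
```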

We’ll also shine a spotlight on Small Language Models (SLMs): why they’re not just cheaper but potentially more predictable and controllable, offering a compelling alternative for hallucination-sensitive use cases.

Whether you're deploying LLMs in production or experimenting with SLMs, this talk will equip you with tools to make your models more trustworthy.

Key Takeaways:

  • Understand why LLMs hallucinate—and how uncertainty estimation can help quantify and mitigate these failures.
  • Learn the three levels of uncertainty estimation (white-box, grey-box, black-box) across different LLM access models; see the black-box sketch after this list.
  • Discover why SoftMax scores can be misleading, and how better statistical tools offer deeper insight into model reliability.
  • Explore how Small Language Models (SLMs) may offer a more controllable and robust path to trustworthy NLP applications.
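
As a counterpart to the white-box sketch above, here is a minimal black-box sketch: sample the same question several times and measure how much the answers agree. The `ask_model` function is a hypothetical placeholder for whatever hosted chat/completions API you call; only the returned text is used, so no logits or token probabilities are required.

```python
# Minimal black-box uncertainty sketch: agreement across repeated samples.
# `ask_model` is a hypothetical stand-in, not a real library call.
from collections import Counter


def ask_model(prompt: str) -> str:
    """Placeholder: call your hosted LLM with temperature > 0 and return its answer."""
    raise NotImplementedError


def agreement_score(prompt: str, n_samples: int = 10) -> float:
    """Fraction of sampled answers matching the most common (modal) answer.

    Values near 1.0 suggest the model answers consistently; values near
    1 / n_samples suggest high uncertainty (or an ambiguous question).
    """
    answers = [ask_model(prompt).strip().lower() for _ in range(n_samples)]
    _, modal_count = Counter(answers).most_common(1)[0]
    return modal_count / n_samples


# Example usage (requires a real ask_model implementation):
# score = agreement_score("Which drug classes interact with warfarin?")
# if score < 0.6:
#     print("Low agreement: route to human review instead of auto-extraction.")
```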
