Building Responsible AI Agents with Guardrails and Safety in Action

About

In this practical session, participants will learn how to build autonomous AI agents using open-source LLMs and apply responsible AI principles through real-world guardrailing techniques. We will walk through the full pipeline — from creating a task-specific agent using LLaMA or Mistral-based models, to integrating NVIDIA NeMo Guardrails, Llama Guard, and prompt-based safety strategies.

We’ll cover critical safety challenges such as:

  • Prompt injection and jailbreaks
  • Toxicity and bias mitigation
  • Controlling agent autonomy and output

This session will include:

  • Setting up an AI agent for a real-world use case (e.g., customer support, knowledge
    assistant)
  • Injecting common adversarial prompts to test vulnerabilities
  • Applying NVIDIA NeMo Guardrails and Llama Guard to detect and prevent harmful
    outputs
  • Using prompt-based guardrailing as a first line of defense
  • Discussing practical limitations and failure cases in alignment and safety

Key Takeaways:

  • Understand responsible AI concepts in the context of agentic systems
  • Gain hands-on experience in building agents and securing them
  • Learn the strengths and limits of guardrail tools like NeMo Guardrails and Llama Guard
  • Walk away with a working demo, GitHub repo, and safety test checklist

Speaker

Book Tickets
Download Brochure

Download agenda