Meet LangSmith Assistant – Polly [An Agent for Agents]

Soumil Jain Last Updated : 16 Dec, 2025
5 min read

Let’s be honest: building AI agents is exciting, but debugging them is not. As we push the boundaries of agentic AI, the complexity of our systems is skyrocketing. We have all been there, staring at a trace with hundreds of steps, trying to figure out why the agent hallucinated or chose the wrong tool. Integrated into LangSmith, Polly is an AI-powered assistant designed to help developers debug, analyze, and engineer better agents. It is a meta layer of intelligence, ironically an agent for agents. This article goes over Polly’s setup, its capabilities, and how it helps in creating better agents.

Why Do We Need an Agent for Agents?

The transition from simple LLM chains to autonomous agents has introduced a new class of debugging challenges that manual inspection can no longer solve efficiently. LangChain identified that agents are fundamentally harder to engineer for three reasons: 

  1. Massive system prompts: Instructions often span hundreds or thousands of lines, making it nearly impossible to pinpoint which specific sentence caused a behaviour regression. 
  2. Deep execution traces: A single agent run generates thousands of data points across multiple steps, creating a volume of logs that is overwhelming for human review. 
  3. Long-context state: Multi-turn conversations can span hours or days, requiring a debugger to understand the entire interaction history to diagnose why a decision was made. 

Polly solves this by acting as a partner that understands agent architectures, allowing you to bypass manual log scanning and instead ask natural-language questions about your system’s performance. 

How to Set Up Polly?

Since Polly is an embedded feature of LangSmith, you don’t install Polly directly. Instead, you enable LangSmith tracing in your application. Once your agent’s data is flowing into the platform, Polly activates automatically. 

Step 1: Install LangSmith 

First, ensure you have the LangSmith SDK in your environment. Run the following command in your operating system’s command line:

pip install -U langsmith 

Step 2: Configure environment variables 

Get your API key from the LangSmith settings page and set the following environment variables. This tells your application to start logging traces to the LangSmith cloud. 

import os 

# Enable tracing (required for Polly to see your data) 
os.environ["LANGSMITH_TRACING"] = "true" 

# Set your API Key 
os.environ["LANGSMITH_API_KEY"] = "ls__..." 

# Optional: Organize your traces into a specific project 
os.environ["LANGSMITH_PROJECT"] = "my-agent-production"

Step 3: Run Your Agent 

That’s it. If you’re using LangChain, tracing is automatic. If you’re using the OpenAI SDK directly, wrap your client to enable visibility. 

from openai import OpenAI 
from langsmith import wrappers 

# Wrap the OpenAI client to capture inputs/outputs automatically 
client = wrappers.wrap_openai(OpenAI()) 

# Run your agent as normal 
response = client.chat.completions.create( 
    model="gpt-4o", 
    messages=[{"role": "user", "content": "Analyze the latest Q3 financial report."}] 
)

Once you have completed the above steps, navigate to the Trace view or Threads view in the LangSmith UI. You will see a Polly icon in the bottom-right corner. 

Polly’s Core Capabilities 

Polly is not just a chatbot wrapper. It is deeply integrated into the LangSmith infrastructure to perform three critical tasks: 

Task 1: Deep Trace Debugging

In the Trace view, Polly analyses individual agent executions to identify subtle failure modes that might be buried in the middle of a long run. You can ask specific diagnostic questions like: 

  • “Did the agent make any mistakes?” 
  • “Where exactly did things go wrong?”  
  • “Why did the agent choose this approach instead of that one?” 

Polly doesn’t just surface information. It understands agent behaviour patterns and can identify issues you’d miss. 

Task 2: Thread-Level Context Analysis

Debugging state is notoriously difficult, especially when an agent works fine for ten turns and fails on the eleventh. Polly can access information from entire conversation threads, allowing it to spot patterns over time, summarize interactions, and identify exactly when and why an agent lost track of critical context. 

You can ask questions like: 

  • “Summarize what happened across multiple interactions” 
  • “Identify patterns in agent behaviour over time” 
  • “Spot when the agent lost track of important context” 

This is especially powerful for debugging those frustrating issues where the agent was working fine and then suddenly it wasn’t. Polly can pinpoint exactly where and why things changed.

Task 3: Automated Prompt Engineering

Perhaps the most powerful feature for developers is Polly’s ability to act as an expert prompt engineer. The system prompt is the brain of any deep agent, and Polly can help iterate on it. You can describe the desired behaviour in natural language, and Polly will update the prompt, define structured output schemas, configure tool definitions, and optimize prompt length without losing critical instructions.


How Does Polly Work Under the Hood?

Polly’s intelligence is built on top of LangSmith’s robust tracing infrastructure, which captures everything your agent does. It ingests three layers of data: 

  1. Runs: Individual steps, such as LLM calls and tool executions. 
  2. Traces: A single execution of your agent, made up of a tree of runs. 
  3. Threads: A full conversation, containing multiple traces. 
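The three layers nest: a thread holds traces, and a trace is a tree of runs. A minimal sketch of that hierarchy (the class names are illustrative, not LangSmith’s actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Run:
    """One step: an LLM call or a tool execution."""
    name: str
    run_type: str                        # e.g. "llm" or "tool"
    children: list["Run"] = field(default_factory=list)

@dataclass
class Trace:
    """One agent execution: a tree of runs."""
    root: Run

@dataclass
class Thread:
    """One full conversation: multiple traces over time."""
    traces: list[Trace] = field(default_factory=list)

# One turn: the agent invokes a search tool, then an LLM call
turn = Trace(Run("agent", "chain", [Run("search", "tool"), Run("answer", "llm")]))
thread = Thread([turn])
print(len(thread.traces[0].root.children))  # two child runs under the root
```

Because every node in this hierarchy already carries inputs, outputs, latency, and token counts, Polly can answer questions at whichever level you ask them.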

Because LangSmith already captures the inputs, outputs, latency, and token counts for every step, Polly has perfect information about the agent’s world. It doesn’t need to guess what happened.  

Conclusion

Polly represents a significant shift in how we approach the AI development lifecycle. It acknowledges that as our agents become more autonomous and complex, the tools we use to maintain them must evolve in parallel. By transforming debugging from a manual, forensic search through logs into a natural-language dialogue, Polly allows developers to focus less on hunting for errors and more on architectural improvements. Ultimately, having an intelligent partner that understands your system’s state isn’t just a convenience; it is becoming a necessity for engineering the next generation of reliable, production-grade agents. 

Frequently Asked Questions

Q1. What problem does Polly actually solve?

A. It helps you debug and analyze complex agents without digging through enormous prompts or long traces. You can ask direct questions about mistakes, decision points, or odd behavior, and Polly pulls the answers from your LangSmith data. 

Q2. How do I enable Polly in my project?

A. You just turn on LangSmith tracing with the SDK and your API key. Once your agent runs and logs show up in LangSmith, Polly becomes available automatically in the UI. 

Q3. What makes Polly different from a normal chatbot?

A. It has full access to runs, traces, and threads, so it understands how your agent works internally. That context lets it diagnose failures, track long-term behavior, and even help refine system prompts. 

I am a Data Science Trainee at Analytics Vidhya, passionately working on the development of advanced AI solutions such as Generative AI applications, Large Language Models, and cutting-edge AI tools that push the boundaries of technology. My role also involves creating engaging educational content for Analytics Vidhya’s YouTube channels, developing comprehensive courses that cover the full spectrum of machine learning to generative AI, and authoring technical blogs that connect foundational concepts with the latest innovations in AI. Through this, I aim to contribute to building intelligent systems and share knowledge that inspires and empowers the AI community.

