Building a Reliable Agent Using Advanced RAG Techniques, LangGraph, and Cohere LLM

Ritika Gupta 23 May, 2024 • 15 min read

Introduction

LLM Agents play an increasingly important role in the generative AI landscape as reasoning engines, but they face formidable challenges, including context understanding, coherence maintenance, and dynamic adaptability, and they often fail or hallucinate. LangGraph, a graph-based framework for orchestrating LLM workflows, gives agents an explicit, controllable structure for navigating complex, multi-step tasks. Advanced RAG techniques such as Adaptive RAG, Corrective RAG, and Self-RAG help mitigate these issues with LLM Agents.

This article uses these RAG techniques to build a reliable, fail-safe LLM Agent with LangChain's LangGraph and Cohere LLM.


Learning Objectives

  • Learn to build a well-grounded LLM Agent
  • Understand and implement advanced RAG techniques such as Adaptive, Corrective, and Self-RAG
  • Understand what LLM Agents are
  • Understand the differences between a LangChain ReAct Agent and LangGraph, and the advantages of LangGraph over LangChain ReAct Agents
  • Learn about LangGraph's key features

This article was published as a part of the Data Science Blogathon.

What is an Agent?

The essential principle underlying agents is to use a language model to pick a sequence of actions. In chains, this sequence is hardcoded in the code. Agents, in contrast, use a language model as a reasoning engine to choose which actions to take and in what order.

It comprises three components:

  • Planning: breaking tasks into smaller sub-goals
  • Memory: short-term (chat history) / long-term (vector store)
  • Tool Use: the agent can use different tools to extend its capabilities, such as internet search or a SQL query retriever

Agents can be created using the ReAct concept with Langchain or LangGraph.

Difference between Langchain Agent and LangGraph

1. Reliability: A ReAct / LangChain Agent is less reliable because the LLM must make the correct decision at every step, whereas LangGraph is more reliable because the control flow is fixed and the LLM performs a specific job at each node of the graph.

2. Flexibility: A ReAct / LangChain Agent is more flexible because the LLM can choose any sequence of action steps, whereas LangGraph is less flexible because actions are constrained by the control flow set at each node.

3. Compatibility with smaller LLMs: ReAct / LangChain Agents do not work well with smaller LLMs, whereas LangGraph is better suited to them.

What is LangGraph?

LangGraph is a package that extends LangChain by enabling circular computing in LLM applications. LangGraph allows for the inclusion of cycles, whereas earlier LangChain allowed the definition of computation chains (Directed Acyclic Graphs or DAGs). This enables more complex, agent-like behaviors in which an LLM can be called in a loop to decide the next action to execute.

Key Concepts of LangGraph

1. Stateful Graph: LangGraph revolves around a stateful graph, where each node represents a step in your computation. The graph maintains a state passed around and updated as the computation progresses.

2. Nodes: Nodes are the building blocks of your LangGraph. Each node represents a function or a computation step. You define nodes to perform specific tasks, such as processing input, making decisions, or interacting with external APIs.

3. Edges: Edges connect the nodes in your graph, defining the computation flow. LangGraph supports conditional edges, allowing you to dynamically determine the next node to execute based on the current state of the graph.
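
To make these concepts concrete before building the full agent, here is a minimal, self-contained sketch (not part of this article's agent, and assuming only that the langgraph package is installed) that ties them together: a typed state, a node that updates it, and a conditional edge that loops the node back onto itself, something a plain DAG chain cannot express.

# Minimal LangGraph sketch: stateful graph + node + conditional (cyclic) edge
from typing_extensions import TypedDict
from langgraph.graph import END, StateGraph


class CounterState(TypedDict):
    count: int


def increment(state: CounterState) -> dict:
    # A node receives the current state and returns the keys it updates
    return {"count": state["count"] + 1}


def should_continue(state: CounterState) -> str:
    # A conditional edge inspects the state and names the next node
    return "increment" if state["count"] < 3 else "end"


workflow = StateGraph(CounterState)
workflow.add_node("increment", increment)
workflow.set_entry_point("increment")
workflow.add_conditional_edges(
    "increment", should_continue, {"increment": "increment", "end": END}
)

app = workflow.compile()
print(app.invoke({"count": 0}))  # {'count': 3}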

Example of LangGraph Workflow

What is Tavily Search API?

Tavily Search API is a search engine optimized for LLMs, aiming for efficient, quick, and persistent search results. Unlike other search APIs like Serp or Google, Tavily optimizes search for AI developers and autonomous AI agents.

What is Cohere LLM?

Cohere is an AI platform for the enterprise that specializes in large language model-powered solutions. Its main offering is the Command R family of models (with Command R+ available as open weights for research), which provides scalable, high-performance models that compete with offerings from firms such as OpenAI and Mistral.

Code Implementation

Workflow of the Agent

RAG Agent Workflow
  1. Based on the question, the Router decides whether to retrieve context from the vector store or perform a web search.
  2. If the Router routes the question to the vector store, matching documents are retrieved from it; otherwise, a web search is performed using the Tavily Search API.
  3. The document grader then grades the documents as relevant or irrelevant.
  4. If the retrieved context is graded as relevant, the hallucination grader checks the generated response for hallucination. If the grader decides the response is free of hallucination, the response is returned to the user as the final answer.
  5. If the context is graded as irrelevant, a web search is performed to retrieve content.
  6. After retrieval, the document grader grades the content obtained from the web search. If relevant, the response is synthesized using the LLM and then presented.
  7. This web-generated response is also passed through the hallucination checker, which decides the next route based on the outcome, as shown in the workflow diagram.

Technology Stack Used

  • Embedding Model: Cohere Embed
  • LLM: Cohere Command R
  • Vector store: Chroma
  • Graph/Agent: LangGraph
  • Web Search API: Tavily Search API

Step 1 – Generate Cohere API Key

We need to generate a free API key to use Cohere LLM. Visit the website and log in using a Google or GitHub account. Once logged in, you will land on the Cohere dashboard page, as shown below.

Click on the API Keys option. You will see that a Trial (free) API key has been generated.

Cohere API Key

Step 2 – Generate Tavily Search API Key

Visit the sign-in page of the site and log in using your Google account; a default free plan with an API key, called the “Research” plan, is generated.

Sign In Page of Tavily

Once you sign in, you will land on your account's home page, which shows the default free plan with a generated API key, similar to the screen below.

Tavily API Key

Step 3 – Install Libraries

Now that the API keys are generated, we need to install the required libraries as shown below. You can use Colab notebooks for development.

!pip install --quiet langchain langchain_cohere tiktoken chromadb pymupdf

Step 4 – Set API Keys

Set the API Keys as environment variables

### Set API Keys
import os

os.environ["COHERE_API_KEY"] = "Cohere API Key"
os.environ["TAVILY_API_KEY"] = "Tavily API Key"
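
If you prefer not to paste raw keys into the notebook, a small optional variant using the standard-library getpass module works as well, followed by a one-line check that the Cohere key is accepted (the prompt strings below are just examples):

### Optional: prompt for keys instead of hard-coding them
import getpass
import os

os.environ["COHERE_API_KEY"] = getpass.getpass("Cohere API Key: ")
os.environ["TAVILY_API_KEY"] = getpass.getpass("Tavily API Key: ")

# Quick sanity check that the Cohere key works
from langchain_cohere import ChatCohere

print(ChatCohere(model="command-r", temperature=0).invoke("Say OK").content)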

Step 5 – Building Vector Index

Build a vector index on top of the PDF using Cohere Embeddings.

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_cohere import CohereEmbeddings
#from langchain_community.document_loaders import WebBaseLoader
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_community.vectorstores import Chroma

# Set embeddings
embd = CohereEmbeddings()


# Load Docs to Index
loader = PyMuPDFLoader('/content/cleartax-in-s-income-tax-slabs.pdf')
data = loader.load()

#print(data[10])


# Split
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=512, chunk_overlap=0
)
doc_splits = text_splitter.split_documents(data)

# Add to vectorstore
vectorstore = Chroma.from_documents(persist_directory='/content/vector',
    documents=doc_splits,
    embedding=embd,
)

vectorstore_retriever = vectorstore.as_retriever()
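
Before wiring the retriever into the agent, an optional sanity check (the query below is just an illustrative example from the document's domain) confirms that chunks are actually returned from the PDF:

# Optional: verify the retriever returns chunks from the indexed PDF
sample_docs = vectorstore_retriever.invoke("What are the slabs of the new tax regime?")
print(len(sample_docs))
print(sample_docs[0].page_content[:200])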

Step 6 – Install the Second Set of Libraries

Install this second set of libraries. Don’t install all libraries together; otherwise, it will throw a dependency error.

!pip install langchain-openai langchainhub chromadb langgraph --quiet

Step 7 – Build the Router

Now, we will build a router that routes queries based on whether the query relates to the vector index. This is based on the Adaptive RAG technique, which routes each query to the most suitable node.

### Router
from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_cohere import ChatCohere


# Data model
class web_search(BaseModel):
    """
    The internet. Use web_search for questions related to anything other than Income Tax of India New and Old Regime Rules.
    """

    query: str = Field(description="The query to use when searching the internet.")


class vectorstore(BaseModel):
    """
    A vectorstore containing documents related to Income Tax of India New and Old Regime Rules. Use the vectorstore for questions on these topics.
    """

    query: str = Field(description="The query to use when searching the vectorstore.")


# Preamble
preamble = """You are an expert at routing a user question to a vectorstore or web search.
The vectorstore contains documents related to Income Tax of India New and Old Regime Rules.
Use the vectorstore for questions on these topics. Otherwise, use web-search."""

# LLM with tool use and preamble
llm = ChatCohere(model="command-r", temperature=0)
structured_llm_router = llm.bind_tools(
    tools=[web_search, vectorstore], preamble=preamble
)

# Prompt
route_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{question}"),
    ]
)

question_router = route_prompt | structured_llm_router
response = question_router.invoke(
    {"question": "When will the results of General Elections 2024 of India be declared?"}
)
print(response.response_metadata["tool_calls"])
response = question_router.invoke({"question": "What are the income tax slabs in New Tax Regime?"})
print(response.response_metadata["tool_calls"])
response = question_router.invoke({"question": "Hi how are you?"})
print("tool_calls" in response.response_metadata)

Outputs

We can see printed output showing the tool each query is routed to, either “web_search” or “vectorstore,” along with the corresponding query arguments. When we ask about the general election, the router chooses web search; when we ask a query related to the Tax Regime (our PDF), it directs us to the vector store; and for small talk such as “Hi how are you?” no tool is called, which is why the last print returns False.

[{'id': '1c86d1f8baa14f3484d1b99c9a53ab3a', 'function': {'name': 'web_search', 'arguments': '{"query": "General Elections 2024 of India results declaration date"}'}, 'type': 'function'}]
[{'id': 'c1356c914562418b943d50d61c2590ea', 'function': {'name': 'vectorstore', 'arguments': '{"query": "income tax slabs in New Tax Regime"}'}, 'type': 'function'}]
False

Step 8 – Build the Retrieval Grader

Now, we will build a binary retrieval grader that grades whether each retrieved document is relevant to the query.

### Retrieval Grader


# Data model
class GradeDocuments(BaseModel):
    """Binary score for relevance check on retrieved documents."""

    binary_score: str = Field(
        description="Documents are relevant to the question, 'yes' or 'no'"
    )


# Prompt
preamble = """You are a grader assessing relevance of a retrieved document to a user question. \n
If the document contains keyword(s) or semantic meaning related to the user question, grade it as relevant. \n
Give a binary score 'yes' or 'no' to indicate whether the document is relevant to the question."""

# LLM with function call
llm = ChatCohere(model="command-r", temperature=0)
structured_llm_grader = llm.with_structured_output(GradeDocuments, preamble=preamble)

grade_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "Retrieved document: \n\n {document} \n\n User question: {question}"),
    ]
)

retrieval_grader = grade_prompt | structured_llm_grader
question = "Old tax regime slabs"
docs = vectorstore_retriever.invoke(question)
doc_txt = docs[1].page_content
response = retrieval_grader.invoke({"question": question, "document": doc_txt})
print(response)

Output

binary_score='yes'

Step 9 – Response Generator

Now, we will build the answer generator, which generates an answer based on the information obtained from the vector store or web search.

### Generate

from langchain import hub
from langchain_core.output_parsers import StrOutputParser
import langchain
from langchain_core.messages import HumanMessage


# Preamble
preamble = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise."""

# LLM
llm = ChatCohere(model_name="command-r", temperature=0).bind(preamble=preamble)

# Prompt
prompt = lambda x: ChatPromptTemplate.from_messages(
    [
        HumanMessage(
            f"Question: {x['question']} \nAnswer: ",
            additional_kwargs={"documents": x["documents"]},
        )
    ]
)

# Chain
rag_chain = prompt | llm | StrOutputParser()

# Run
generation = rag_chain.invoke({"documents": docs, "question": question})
print(generation)

Output

Under the old tax regime in India, there were separate slab rates for different categories of taxpayers. Taxpayers with an income of up to 5 lakhs were eligible for a rebate.

Step 10 – LLM Chain for Fallback

If the router cannot map a question to the vector store or a web search (for example, simple chit-chat), this LLM chain serves as the fallback. Note that, unlike the RAG prompt, this prompt does not have the “documents” variable.

### LLM fallback

from langchain import hub
from langchain_core.output_parsers import StrOutputParser
import langchain
from langchain_core.messages import HumanMessage


# Preamble
preamble = """You are an assistant for question-answering tasks. Answer the question based upon your knowledge. Use three sentences maximum and keep the answer concise."""

# LLM
llm = ChatCohere(model_name="command-r", temperature=0).bind(preamble=preamble)

# Prompt
prompt = lambda x: ChatPromptTemplate.from_messages(
    [HumanMessage(f"Question: {x['question']} \nAnswer: ")]
)

# Chain
llm_chain = prompt | llm | StrOutputParser()

# Run
question = "Hi how are you?"
generation = llm_chain.invoke({"question": question})
print(generation)

Output

I don't have feelings as an AI chatbot, but I'm here to assist you with any questions or tasks you may have. How can I help you today?

Step 11 – Building the Hallucination Checker

Now, we will build a simple hallucination checker that gives a binary score of “yes” or “no” indicating whether the generated response is grounded in the retrieved context and free from hallucination.

### Hallucination Grader


# Data model
class GradeHallucinations(BaseModel):
    """Binary score for hallucination present in generation answer."""

    binary_score: str = Field(
        description="Answer is grounded in the facts, 'yes' or 'no'"
    )


# Preamble
preamble = """You are a grader assessing whether an LLM generation is grounded in / supported by a set of retrieved facts. \n
Give a binary score 'yes' or 'no'. 'Yes' means that the answer is grounded in / supported by the set of facts."""

# LLM with function call
llm = ChatCohere(model="command-r", temperature=0)
structured_llm_grader = llm.with_structured_output(
    GradeHallucinations, preamble=preamble
)

# Prompt
hallucination_prompt = ChatPromptTemplate.from_messages(
    [
        # ("system", system),
        ("human", "Set of facts: \n\n {documents} \n\n LLM generation: {generation}"),
    ]
)

hallucination_grader = hallucination_prompt | structured_llm_grader
hallucination_grader.invoke({"documents": docs, "generation": generation})

Step 12 – Building the Answer Grader

This check runs after the hallucination grader passes the response to this node. It verifies whether the generated answer actually addresses the question.

### Answer Grader


# Data model
class GradeAnswer(BaseModel):
    """Binary score to assess answer addresses question."""

    binary_score: str = Field(
        description="Answer addresses the question, 'yes' or 'no'"
    )


# Preamble
preamble = """You are a grader assessing whether an answer addresses / resolves a question. \n
Give a binary score 'yes' or 'no'. 'Yes' means that the answer resolves the question."""

# LLM with function call
llm = ChatCohere(model="command-r", temperature=0)
structured_llm_grader = llm.with_structured_output(GradeAnswer, preamble=preamble)

# Prompt
answer_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "User question: \n\n {question} \n\n LLM generation: {generation}"),
    ]
)

answer_grader = answer_prompt | structured_llm_grader
answer_grader.invoke({"question": question, "generation": generation})

Step 13 – Building the Web Search Tool

Now, we will build the web search tool using Tavily API.

### Search

from langchain_community.tools.tavily_search import TavilySearchResults

web_search_tool = TavilySearchResults()
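
It can help to inspect the tool's raw output once, since the web_search node defined later joins the "content" field across results. With the default configuration, TavilySearchResults.invoke returns a list of dicts containing "url" and "content" keys; the query below is only an example:

# Optional: inspect the raw Tavily output shape
results = web_search_tool.invoke({"query": "income tax slabs new tax regime India"})
print(type(results), len(results))
print(results[0]["content"][:200])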

Step 14 – Building the Workflow of the Graph

We will now capture the workflow of our Agent. First, we define a class for maintaining the state at each decision point.

Steps involved in creating a graph using LangGraph:

  1. Define the Graph State: This represents the state of the graph.
  2. Create the Graph.
  3. Define the Nodes: Here, we define the different functions associated with each workflow state.
  4. Add nodes to the Graph: Here, add our nodes and define the flow using edges and conditional edges.
  5. Set Entry and End Points of the Graph.

from typing_extensions import TypedDict
from typing import List


class GraphState(TypedDict):
    """
    Represents the state of our graph.

    Attributes:
        question: question
        generation: LLM generation
        documents: list of documents
    """

    question: str
    generation: str
    documents: List[str]

Step 15 – Building the Graph

We now define the nodes and the edges of the graph.

from langchain.schema import Document


def retrieve(state):
    """
    Retrieve documents

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, documents, that contains retrieved documents
    """
    print("---RETRIEVE---")
    question = state["question"]

    # Retrieval
    documents = vectorstore_retriever.invoke(question)
    return {"documents": documents, "question": question}


def llm_fallback(state):
    """
    Generate answer using the LLM w/o vectorstore

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, generation, that contains LLM generation
    """
    print("---LLM Fallback---")
    question = state["question"]
    generation = llm_chain.invoke({"question": question})
    return {"question": question, "generation": generation}


def generate(state):
    """
    Generate answer using the vectorstore

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): New key added to state, generation, that contains LLM generation
    """
    print("---GENERATE---")
    question = state["question"]
    documents = state["documents"]
    if not isinstance(documents, list):
        documents = [documents]

    # RAG generation
    generation = rag_chain.invoke({"documents": documents, "question": question})
    return {"documents": documents, "question": question, "generation": generation}


def grade_documents(state):
    """
    Determines whether the retrieved documents are relevant to the question.

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Updates documents key with only filtered relevant documents
    """

    print("---CHECK DOCUMENT RELEVANCE TO QUESTION---")
    question = state["question"]
    documents = state["documents"]

    # Score each doc
    filtered_docs = []
    for d in documents:
        score = retrieval_grader.invoke(
            {"question": question, "document": d.page_content}
        )
        grade = score.binary_score
        if grade == "yes":
            print("---GRADE: DOCUMENT RELEVANT---")
            filtered_docs.append(d)
        else:
            print("---GRADE: DOCUMENT NOT RELEVANT---")
            continue
    return {"documents": filtered_docs, "question": question}


def web_search(state):
    """
    Web search based on the re-phrased question.

    Args:
        state (dict): The current graph state

    Returns:
        state (dict): Updates documents key with appended web results
    """

    print("---WEB SEARCH---")
    question = state["question"]

    # Web search
    docs = web_search_tool.invoke({"query": question})
    web_results = "\n".join([d["content"] for d in docs])
    web_results = Document(page_content=web_results)

    return {"documents": web_results, "question": question}


### Edges ###


def route_question(state):
    """
    Route question to web search or RAG.

    Args:
        state (dict): The current graph state

    Returns:
        str: Next node to call
    """

    print("---ROUTE QUESTION---")
    question = state["question"]
    source = question_router.invoke({"question": question})

    # Fallback to LLM or raise error if no decision
    if "tool_calls" not in source.additional_kwargs:
        print("---ROUTE QUESTION TO LLM---")
        return "llm_fallback"
    if len(source.additional_kwargs["tool_calls"]) == 0:
        raise ValueError("Router could not decide source")

    # Choose datasource
    datasource = source.additional_kwargs["tool_calls"][0]["function"]["name"]
    if datasource == "web_search":
        print("---ROUTE QUESTION TO WEB SEARCH---")
        return "web_search"
    elif datasource == "vectorstore":
        print("---ROUTE QUESTION TO RAG---")
        return "vectorstore"
    else:
        print("---ROUTE QUESTION TO LLM---")
        return "llm_fallback"


def decide_to_generate(state):
    """
    Determines whether to generate an answer, or re-generate a question.

    Args:
        state (dict): The current graph state

    Returns:
        str: Binary decision for next node to call
    """

    print("---ASSESS GRADED DOCUMENTS---")
    question = state["question"]
    filtered_documents = state["documents"]

    if not filtered_documents:
        # All documents were filtered out by the relevance check,
        # so fall back to a web search
        print("---DECISION: ALL DOCUMENTS ARE NOT RELEVANT TO QUESTION, WEB SEARCH---")
        return "web_search"
    else:
        # We have relevant documents, so generate answer
        print("---DECISION: GENERATE---")
        return "generate"


def grade_generation_v_documents_and_question(state):
    """
    Determines whether the generation is grounded in the document and answers question.

    Args:
        state (dict): The current graph state

    Returns:
        str: Decision for next node to call
    """

    print("---CHECK HALLUCINATIONS---")
    question = state["question"]
    documents = state["documents"]
    generation = state["generation"]

    score = hallucination_grader.invoke(
        {"documents": documents, "generation": generation}
    )
    grade = score.binary_score

    # Check hallucination
    if grade == "yes":
        print("---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---")
        # Check question-answering
        print("---GRADE GENERATION vs QUESTION---")
        score = answer_grader.invoke({"question": question, "generation": generation})
        grade = score.binary_score
        if grade == "yes":
            print("---DECISION: GENERATION ADDRESSES QUESTION---")
            return "useful"
        else:
            print("---DECISION: GENERATION DOES NOT ADDRESS QUESTION---")
            return "not useful"
    else:
        print("---DECISION: GENERATION IS NOT GROUNDED IN DOCUMENTS, RE-TRY---")
        return "not supported"
        

Step 16 – Build the LangGraph

Add the nodes in the workflow and conditional edges. First, add all the nodes, then add the edges and define edges with conditions. 

import pprint

from langgraph.graph import END, StateGraph

workflow = StateGraph(GraphState)

# Define the nodes
workflow.add_node("web_search", web_search)  # web search
workflow.add_node("retrieve", retrieve)  # retrieve
workflow.add_node("grade_documents", grade_documents)  # grade documents
workflow.add_node("generate", generate)  # rag
workflow.add_node("llm_fallback", llm_fallback)  # llm

# Build graph
workflow.set_conditional_entry_point(
    route_question,
    {
        "web_search": "web_search",
        "vectorstore": "retrieve",
        "llm_fallback": "llm_fallback",
    },
)
workflow.add_edge("web_search", "generate")
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {
        "web_search": "web_search",
        "generate": "generate",
    },
)
workflow.add_conditional_edges(
    "generate",
    grade_generation_v_documents_and_question,
    {
        "not supported": "generate",  # Hallucinations: re-generate
        "not useful": "web_search",  # Fails to answer question: fall-back to web-search
        "useful": END,
    },
)
workflow.add_edge("llm_fallback", END)

# Compile
app = workflow.compile()
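
Optionally, you can smoke-test the compiled graph right away. Because the "not supported" branch loops generate back onto itself, it is worth capping the total number of graph steps via recursion_limit, a standard LangGraph/LangChain runnable-config option (the default is 25); the question below is just an example from our PDF's domain:

# Optional smoke test; recursion_limit guards against an unbounded generate -> generate retry loop
result = app.invoke(
    {"question": "What are the income tax slabs in the new tax regime?"},
    config={"recursion_limit": 10},
)
print(result["generation"])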

Step 17 – Install Libraries for Visualizing the Graph

We will now install additional libraries to visualize the workflow graph.

!apt-get install python3-dev graphviz libgraphviz-dev pkg-config
!pip install pygraphviz

Step 18 – Visualize the Graph

The dashed edges are conditional edges, whereas solid edges are non-conditional direct edges.

from IPython.display import Image

Image(app.get_graph().draw_png())
The graph of workflow for RAG Agent
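
If installing the Graphviz system packages is inconvenient, newer LangGraph releases can also render the same graph as a Mermaid diagram; note that draw_mermaid_png() calls the mermaid.ink web service by default, so it needs internet access. Treat this as an optional alternative:

# Optional alternative: Mermaid rendering (no graphviz/pygraphviz needed)
from IPython.display import Image

Image(app.get_graph().draw_mermaid_png())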

Step 19 – Execute the LangGraph Workflow

We now execute our workflow to check if it gives the desired output based on the defined workflow.

Example 1 – Web Search Query

# Execute
inputs = {
    "question": "Give the dates of different phases of general election 2024 in India?"
}
for output in app.stream(inputs):
    for key, value in output.items():
        # Node
        pprint.pprint(f"Node '{key}':")
        # Optional: print full state at each node
    pprint.pprint("\n---\n")

# Final generation
pprint.pprint(value["generation"])

Output


---ROUTE QUESTION---
---ROUTE QUESTION TO RAG---
---RETRIEVE---
"Node 'retrieve':"
'\n---\n'
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---GRADE: DOCUMENT NOT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: ALL DOCUMENTS ARE NOT RELEVANT TO QUESTION, WEB SEARCH---
"Node 'grade_documents':"
'\n---\n'
---WEB SEARCH---
"Node 'web_search':"
'\n---\n'
---GENERATE---
---CHECK HALLUCINATIONS---
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
---DECISION: GENERATION ADDRESSES QUESTION---
"Node 'generate':"
'\n---\n'
('The 2024 Indian general election will take place in seven phases, with '
 'voting scheduled for: April 19, April 26, May 7, May 13, May 20, May 25, and '
 'June 1.')

Example 2 – Relevant Vector Store Query

# Run
inputs = {"question": "What are the slabs of new tax regime?"}
for output in app.stream(inputs):
    for key, value in output.items():
        # Node
        pprint.pprint(f"Node '{key}':")
        # Optional: print full state at each node
        # pprint.pprint(value["keys"], indent=2, width=80, depth=None)
    pprint.pprint("\n---\n")

# Final generation
pprint.pprint(value["generation"])

Output

---ROUTE QUESTION---
---ROUTE QUESTION TO RAG---
---RETRIEVE---
"Node 'retrieve':"
'\n---\n'
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---GRADE: DOCUMENT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: GENERATE---
"Node 'grade_documents':"
'\n---\n'
---GENERATE---
---CHECK HALLUCINATIONS---
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
---DECISION: GENERATION ADDRESSES QUESTION---
"Node 'generate':"
'\n---\n'
('Here are the slabs of the new tax regime for the given years:\n'
 '\n'
 '## FY 2022-23 (AY 2023-24)\n'
 '- Up to Rs 2,50,000: Nil\n'
 '- Rs 2,50,001 to Rs 5,00,000: 5%\n'
 '- Rs 5,00,001 to Rs 7,50,000: 10%\n'
 '- Rs 7,50,001 to Rs 10,00,000: 15%\n'
 '- Rs 10,00,001 to Rs 12,50,000: 20%\n'
 '- Rs 12,50,001 to Rs 15,00,000: 25%\n'
 '- Rs 15,00,001 and above: 30%\n'
 '\n'
 '## FY 2023-24 (AY 2024-25)\n'
 '- Up to Rs 3,00,000: Nil\n'
 '- Rs 3,00,000 to Rs 6,00,000: 5% on income above Rs 3,00,000\n'
 '- Rs 6,00,000 to Rs 900,000: Rs. 15,000 + 10% on income above Rs 6,00,000\n'
 '- Rs 9,00,000 to Rs 12,00,000: Rs. 45,000 + 15% on income above Rs 9,00,000\n'
 '- Rs 12,00,000 to Rs 1500,000: Rs. 90,000 + 20% on income above Rs '
 '12,00,000\n'
 '- Above Rs 15,00,000: Rs. 150,000 + 30% on income above Rs 15,00,000')

Conclusion

LangGraph is a versatile tool for developing complex, stateful applications with LLMs. By understanding its essential ideas and working through basic examples, beginners can apply it to their own projects. It is critical to focus on maintaining state, handling conditional edges, and ensuring that the graph has no dead-end nodes.
In my view, it is more advantageous than ReAct agents because we establish total control of the workflow instead of leaving the decisions to the agent.

Key Takeaways

  • We learned about LangGraph and its implementation.
  • We learned how to implement it using newer tools such as Cohere LLM and the Tavily Search API.
  • We understood the difference between a ReAct Agent and LangGraph.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Frequently Asked Questions

Q1. Is the Cohere API free to use?

A. Yes, Cohere currently allows free, rate-limited API calls for research and prototyping.

Q2. What are the advantages of Tavily Search API? 

A. It is optimized for RAG and LLM-driven search, compared with conventional search APIs.

Q3. What is the compatibility of LangGraph?

A. LangGraph offers compatibility with existing LangChain agents, allowing developers to modify AgentExecutor internals more easily. The state of the graph includes familiar concepts like input, chat_history, intermediate_steps, and agent_outcome.

Q4. What are the further scopes of improvement in this strategy?

A. We can further enhance this Adaptive RAG strategy by integrating Self-Reflection into RAG (Self-RAG), which iteratively fetches documents with self-reasoning and refines the answer.

Q5. What are the other LLM models offered by Cohere?

A. Cohere offers several models; the initial versions were Command and Command R. Command R+ is the latest multilingual model with a larger 128k context window. Apart from these LLMs, Cohere also offers an embedding model, Embed, and a reranking model, Rerank.

