Empower Multiple Websites with Langchain’s Chatbot Solution

Tarun R Jain 26 Oct, 2023 • 7 min read

Introduction

In the revolutionary era of AI, conversational agents, or chatbots, have emerged as pivotal tools for engaging users, assisting them, and enhancing the user experience across various digital platforms. Chatbots powered by advanced AI techniques enable automated, interactive conversations that resemble human interactions. With the launch of ChatGPT, the ability to answer user queries has reached greater heights. Building chatbots such as ChatGPT on custom data can help businesses gather better user feedback and improve the user experience. In this article, we will build Langchain's chatbot solution, like ChatGPT, over multiple custom websites using the Retrieval Augmented Generation (RAG) technique. To begin with the project, we will first understand a few critical components needed to build such an application.


Learning Objectives

Here is what you will learn from this project:

  • Building a chatbot like ChatGPT on custom data.
  • The need for RAG (Retrieval Augmented Generation).
  • Using core components like loaders, chunking, and embeddings to build a chatbot like ChatGPT.
  • The importance of in-memory vector databases with Langchain.
  • Implementing a RetrievalQA chain using the ChatOpenAI chat LLM.

This article was published as a part of the Data Science Blogathon.

What is Langchain, and Why Use it?

To build a chatbot like ChatGPT on custom data, we need a framework like Langchain. Langchain is an open-source framework designed to drive the development of applications powered by large language models (LLMs). At its core, LangChain facilitates the creation of applications that possess a crucial attribute: context awareness. These applications connect LLMs to custom data sources, including prompt instructions, few-shot examples, and contextual content. Through this vital integration, the language model can ground its responses in the provided context, resulting in a more nuanced and informed interaction with the user.

LangChain provides a high-level API that makes it easy to connect language models to other data sources and build complex applications. With it, you can build applications such as search engines, advanced recommendation systems, eBook PDF summarization, question-answering agents, code assistant chatbots, and many more.

Understanding RAG- Retrieval Augmented Generation


Large language models are great at generating responses as general-purpose AI. They can handle various tasks like code generation, email writing, drafting blog articles, and so on. But one huge disadvantage is domain-specific knowledge: LLMs usually tend to hallucinate when answering domain-specific questions. One approach to reducing hallucinations and teaching a pre-trained LLM a domain-specific dataset is Fine Tuning. Fine-tuning reduces hallucinations and is an effective way to make a model learn domain knowledge, but it comes with higher costs: fine-tuning requires training time and computational resources that can be expensive.

RAG comes to the rescue. Retrieval Augmented Generation (RAG) feeds the domain data content to the LLM so that it can produce contextually relevant and factual responses. RAG not only injects this knowledge but also requires no re-training of the LLM. This approach reduces the computation requirements and helps organizations operate on limited training infrastructure. RAG utilizes vector databases, which also help in scaling the application.
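Before diving into Langchain, here is a minimal, framework-free sketch of the RAG flow. It is a toy illustration with made-up two-dimensional vectors and a hand-rolled cosine similarity, not the actual pipeline: embed the query, retrieve the most similar chunk, and ground the LLM prompt in it.

from math import sqrt

def cosine(a, b):
    #cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

#toy "vector store": (embedding, text) pairs. In practice these come from
#an embedding model and live in a vector database such as Chroma.
store = [
    ([0.9, 0.1], "Chunk about multi-channel TIF image support."),
    ([0.1, 0.9], "Chunk about getting started with caMicroscope."),
]

query_vector = [0.8, 0.2]  #in practice: the embedded user question

#retrieval: pick the stored chunk most similar to the query
best_chunk = max(store, key=lambda item: cosine(item[0], query_vector))[1]

#augmented generation: stuff the retrieved context into the LLM prompt
prompt = f"Answer using only this context:\n{best_chunk}\n\nQuestion: ..."
print(prompt)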

Chat with Multiple Websites Workflow

The figure illustrates the workflow of the Chat with Multiple Websites project: load the website content, split it into chunks, embed the chunks, store them in a vector database, and retrieve the relevant context to answer user queries.

Let’s dive into the code to understand the components used in the Workflow.

Installation

You can install LangChain using the pip command. We also install openai (needed for the OpenAI models and API key), along with chromadb and tiktoken.

pip install langchain
pip install openai
pip install chromadb tiktoken

Let’s set up the OpenAI API key.

In this project, we will use ChatOpenAI with a gpt-3.5-turbo-16k model and OpenAI embeddings. Both these components require an OpenAI API key. In order to get your API key, log in to platform.openai.com.

1. After you log into your account, click on your profile and choose “View API keys“.

2. Press “Create new secret key” and copy your API key.


Create an environment variable using the os library and paste in your API key:

import os

os.environ['OPENAI_API_KEY'] = "sk-......zqBp" #replace the key

Data Source- ETL (Extract, Transform and Load)

In order to build a chatbot application like ChatGPT, the fundamental requirement is custom data. Since we want to chat with multiple websites in this project, we define the website URLs and load this data source via WebBaseLoader. A Langchain loader such as WebBaseLoader scrapes the content from the respective URLs.

from langchain.document_loaders import WebBaseLoader

URLS = [
    'https://medium.com/@jaintarun7/getting-started-with-camicroscope-4e343429825d',
    'https://medium.com/@jaintarun7/multichannel-image-support-week2-92c17a918cd6',
    'https://medium.com/@jaintarun7/multi-channel-support-week3-2d220b27b22a'    
]

loader = WebBaseLoader(URLS)
data = loader.load()
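As an optional sanity check, you can inspect what the loader returned; each URL becomes a Document with page_content and metadata:

#optional sanity check on the loaded documents
print(len(data))                   #one Document per URL
print(data[0].metadata)            #source URL of the first document
print(data[0].page_content[:200])  #first 200 characters of scraped text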

Chunking

Chunking traditionally refers to a linguistic task of identifying and segmenting contiguous, non-overlapping groups of words (or tokens) in a sentence that serve a common grammatical function. In this project, chunking simply means breaking the large scraped text into smaller segments. Langchain provides text splitters such as CharacterTextSplitter, which splits text on characters.

from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
websites_data = text_splitter.split_documents(data)
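Note that CharacterTextSplitter measures chunk_size in characters, not tokens. Optionally, you can inspect the resulting chunks:

#optional: inspect the number and shape of the chunks
print(len(websites_data))                   #total chunks across all websites
print(websites_data[0].page_content[:200])  #preview of the first chunk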

Embeddings

For a deep learning model dealing with text, the text must first pass through an embedding layer. Similarly, to make the model learn the context, the chunked data needs to be converted into embeddings. Embeddings are a way to convert words or tokens into numerical vectors. This transformation is crucial because it lets us represent textual data, which is inherently discrete and symbolic, in a continuous vector space, where each word or token is represented by a unique vector.

from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
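Optionally, you can embed a sample query to see what a vector looks like; with OpenAI's default text-embedding-ada-002 model, each vector should have 1536 dimensions:

#optional: embed a sample query and inspect the vector
query_vector = embeddings.embed_query("multi channel image support")
print(len(query_vector))  #e.g. 1536 for text-embedding-ada-002
print(query_vector[:5])   #first few numbers of the vector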

Vector Databases

The actual website data is extracted and converted into embeddings in vector form. Vector databases are a way to store these embeddings, in databases such as Chroma. The vector database is a newer type of database that is becoming popular in the world of ML and AI. The key advantage of a vector database is its search technique: similarity search. Given a user query, the result of similarity search and retrieval is typically a ranked list of vectors with the highest similarity scores to the query vector. Using this metric, the application can return more factual, grounded responses.

A few commonly used and popular open-source vector databases are Chroma, Elasticsearch, Milvus, Qdrant, Weaviate, and FAISS.

from langchain.vectorstores import Chroma

websearch = Chroma.from_documents(websites_data, embeddings)
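Before wiring up the full chain, you can optionally run a raw similarity search against the store to verify that relevant chunks come back for a sample query:

#optional: raw similarity search against the vector store
docs = websearch.similarity_search("multi channel TIF images", k=2)
for doc in docs:
    print(doc.metadata.get("source"), "->", doc.page_content[:100])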

Large Language Chat Models

In this step, we define the large language model that creates the response. Make sure to use gpt-3.5-turbo-16k as the model when working with multiple data sources, since the retrieved context will occupy a larger number of tokens; the 16K context window helps avoid an InvalidRequestError caused by exceeding the context limit.

from langchain.chat_models import ChatOpenAI

model = ChatOpenAI(model='gpt-3.5-turbo-16k',temperature=0.7)
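An optional one-line smoke test (assuming your API key is set) confirms the chat model works before building the chain:

#optional smoke test for the chat model
print(model.predict("Say hello in one short sentence."))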

User Prompt and Retrieval

We have reached the final part of the project, where we take the input prompt and use the vector database retriever to fetch the relevant context for it. RetrievalQA chains the large language model and the vector database together, which helps produce better responses.

from langchain.chains import RetrievalQA

rag = RetrievalQA.from_chain_type(llm=model, chain_type="stuff", retriever=websearch.as_retriever())

prompt = "Write code implementation for Multiple Tif image conversion into RGB"
response = rag.run(prompt)
print(response)

Output

The chain prints a generated answer grounded in the content scraped from the three blog posts.

Putting It All Together

#installation
!pip install langchain openai tiktoken chromadb

#import required libraries
import os
from getpass import getpass
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

#set up OpenAI API Key
api_key = getpass()
os.environ['OPENAI_API_KEY'] = api_key

#ETL=> load the data
URLS = [
    'https://medium.com/@jaintarun7/getting-started-with-camicroscope-4e343429825d',
    'https://medium.com/@jaintarun7/multichannel-image-support-week2-92c17a918cd6',
    'https://medium.com/@jaintarun7/multi-channel-support-week3-2d220b27b22a'    
]

loader = WebBaseLoader(URLS)
data = loader.load()

#Chunking => split the text into smaller chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
websites_data = text_splitter.split_documents(data)

#create embeddings
embeddings = OpenAIEmbeddings()

#store embeddings and the data inside Chroma - vector database
websearch = Chroma.from_documents(websites_data, embeddings)

#define chat large language model-> 16K token size
model = ChatOpenAI(model='gpt-3.5-turbo-16k',temperature=0.7)
#retrieval chain
rag = RetrievalQA.from_chain_type(llm=model, chain_type="stuff", retriever=websearch.as_retriever())

#retrieve relevant output
prompt = "Write code implementation for Multiple Tif image conversion into RGB"
#run your query
response = rag.run(prompt)
print(response)

Conclusion

To conclude the article: we have successfully built a chatbot for multiple websites using Langchain. This is not just a simple chatbot; rather, it is a chatbot that answers like ChatGPT, but on your data. The key takeaways from this article are:

  • Langchain is a powerful open-source LLM framework that helps build a chatbot like ChatGPT on custom data.
  • We discussed the challenges with pre-trained models and why the Retrieval Augmented Generation approach suits this use case better than fine-tuning. One disclaimer: in some cases, fine-tuning is still preferred over RAG to get more factual, domain-adapted responses.
  • To build a chatbot like ChatGPT, we prepared a project workflow with core components such as loaders, chunking, embeddings, vector databases, and chat language models.
  • Vector databases are the key advantage of RAG pipelines. This article also listed a few popular open-source vector databases.

Hopefully, this project use case has inspired you to explore the potential of Langchain and RAG.

Frequently Asked Questions

Q1. What is the difference between LangChain and LLM?

A. A large language model (LLM) is a transformer-based model that generates text based on the user prompt, whereas LangChain is a framework that treats the LLM as one component alongside various others, such as memory, vector databases, embeddings, and so on.

Q2. What is Langchain used for?

A. Langchain is a robust open-source framework used to build a chatbot like ChatGPT on your own data. With this framework, you can build various applications, such as search applications, question-answering bots, code generation assistants, and more.

Q3. Is Langchain a library or framework?

A. Langchain is an open-source framework designed to drive the development of applications powered by large language models (LLMs). At its core, LangChain facilitates the creation of applications that possess a crucial attribute: context awareness.

Q4. What is the RAG technique in LLM?

A. Retrieval Augmented Generation (RAG) feeds domain-specific data content to the LLM so that it can produce contextually relevant and factual responses.

Q5. What is RAG vs Fine-tuning?

A. RAG is a technique that combines an LLM with an information knowledge store to generate a response. The core idea behind RAG is knowledge transfer with no re-training of the model, whereas fine-tuning is a technique where we expose the LLM to the data and re-train the model so that it internalizes the external knowledge.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
