Build Custom FAQ Chatbot with BERT 

Sarvagya Agrawal 12 Jul, 2023 • 5 min read

Chatbots have become increasingly standard and valuable interfaces employed by numerous organizations for various purposes. They find numerous applications across different industries, such as providing personalized product recommendations to customers, offering round-the-clock customer support for query resolution, assisting with customer bookings, and much more. This article explores the process of creating a FAQ chatbot specifically designed for customer interaction. FAQ chatbots address questions within a specific domain, utilizing a predefined list of questions and corresponding answers. This type of chatbot relies on Semantic Question Matching as its underlying mechanism.

Learning Objectives

  • Understand the basics of the BERT model
  • Understand Elasticsearch and its application in chatbots
  • Learn the mechanism for creating the chatbot
  • Indexing and querying in Elasticsearch

This article was published as a part of the Data Science Blogathon.

What is BERT?


BERT, Bidirectional Encoder Representations from Transformers, is a language model released by Google in 2018. Unlike unidirectional models, BERT is a bidirectional model based on the Transformer architecture. It learns the context of a word by considering both the words that come before and after it in a sentence, enabling a more comprehensive understanding.

One major limitation of BERT was that it could not achieve state-of-the-art performance on sentence-level NLP tasks. The primary issue was that its token-level embeddings could not be used effectively for textual similarity, resulting in poor performance when generating sentence embeddings.

However, Sentence-BERT (SBERT) was developed to address this challenge. SBERT is based on a Siamese Network, which takes two sentences at a time and converts them into token-level embeddings using the BERT model. It then applies a pooling layer to each set of embeddings to generate sentence embeddings. In this article, we will use SBERT for sentence embeddings.
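The pooling step SBERT applies can be illustrated with a minimal sketch. This assumes mean pooling over token vectors and uses tiny toy numbers in place of real BERT outputs:

```python
def mean_pool(token_embeddings):
    """Average token-level vectors (one list per token) into a single sentence vector."""
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / n for i in range(dim)]

# Toy stand-in for BERT output: 3 tokens, hidden size 4 (real BERT uses 768)
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
    [2.0, 2.0, 2.0, 2.0],
]
print(mean_pool(tokens))  # [2.0, 2.0, 2.0, 2.0]
```

Averaging the per-token vectors this way yields one fixed-size vector per sentence, which is what makes direct sentence-to-sentence comparison possible.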


What is Elasticsearch?

Elasticsearch is an open-source search and analytics engine that is very powerful, highly scalable, and designed to handle extensive data in real time. It is built on top of the Apache Lucene library, which provides full-text search capabilities. Elasticsearch is highly scalable because it distributes data across multiple nodes, providing high availability and fault tolerance. It also offers a flexible and robust RESTful API, which allows interaction with the search engine using HTTP requests. It supports various programming languages and provides client libraries for easy application integration.
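Because the API is JSON-over-HTTP, any client that can serialize JSON can talk to Elasticsearch. A minimal sketch of building a request body by hand (the field name and query text here are illustrative):

```python
import json

# A "match" query in the shape Elasticsearch's REST API expects
query_body = {
    "query": {
        "match": {"question": "improve english speaking"}
    }
}

# This JSON string could be POSTed to http://localhost:9200/<index>/_search
payload = json.dumps(query_body)
print(payload)
```

The official Python client used later in this article builds and sends exactly this kind of JSON body under the hood.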

This article will teach us to create a FAQ chatbot with pre-trained SBERT and Elasticsearch.

Step 1) Install SBERT Library

#install sentence transformers library
pip install sentence-transformers

Step 2) Generate Question Embeddings

We will use the SBERT library to get the embeddings for the predefined questions. For each question, it will generate a NumPy array of dimension 768, which matches the size of a standard BERT token-level embedding:

from sentence_transformers import SentenceTransformer

sent_transformer = SentenceTransformer("bert-base-nli-mean-tokens")

questions = [
    "How to improve your conversation skills? ",
    "Who decides the appointment of Governor in India? ",
    "What is the best way to earn money online?",
    "Who is the head of the Government in India?",
    "How do I improve my English speaking skills? ",
]

ques_embedd = sent_transformer.encode(questions)
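Once embeddings are computed, semantic closeness between two questions reduces to cosine similarity between their vectors. A minimal pure-Python sketch with toy vectors (real SBERT vectors have 768 dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

This is the same measure Elasticsearch's `cosineSimilarity` script function applies to the stored `dense_vector` fields later in the article.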

Step 3) Install Elasticsearch Library

pip install elasticsearch

Step 4) Creating Index in Elasticsearch

from elasticsearch import Elasticsearch

# define the Python client for Elasticsearch
es_client = Elasticsearch("localhost:9200")

INDEX_NAME = "chat_bot_index"

# dimension of the embedding NumPy array, i.e. 768
dim_embedding = 768

def create_index() -> None:
    es_client.indices.delete(index=INDEX_NAME, ignore=404)
    es_client.indices.create(
        index=INDEX_NAME,
        body={
            "mappings": {
                "properties": {
                    "embedding": {
                        "type": "dense_vector",
                        "dims": dim_embedding,
                    },
                    "question": {
                        "type": "text",
                    },
                    "answer": {
                        "type": "text",
                    },
                }
            }
        },
    )

create_index()

The process of creating an index in Elasticsearch is very similar to defining a schema in a database. In the above code, we have created an index called “chat_bot_index,” which defines three fields, i.e., ‘embedding,’ ‘question,’ and ‘answer,’ and their types: “dense_vector” for ‘embedding’ and “text” for the other two.

Step 5) Indexing Question-answers in Elastic Search

from typing import Dict, List

def indexing_q(qa_pairs: List[Dict[str, str]]) -> None:
    for pair in qa_pairs:
        ques = pair["question"]
        ans = pair["answer"]
        # encode expects a list of sentences; take the first (and only) embedding
        embedding = sent_transformer.encode([ques])[0].tolist()
        data = {
            "question": ques,
            "embedding": embedding,
            "answer": ans,
        }
        es_client.index(index=INDEX_NAME, body=data)

qa_pairs = [{
    "question": "How to improve your conversation skills? ",
    "answer": "Speak more",
}, {
    "question": "Who decides the appointment of Governor in India? ",
    "answer": "President of India",
}, {
    "question": "How can I improve my English speaking skills? ",
    "answer": "More practice",
}]

indexing_q(qa_pairs)

In the above code, we have indexed the question-answer pairs in Elasticsearch along with the embeddings of the questions.
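For larger FAQ lists, the `elasticsearch.helpers.bulk` helper is usually faster than indexing one document at a time. A sketch of building the bulk actions, with a placeholder zero vector standing in for the real SBERT embedding:

```python
# Build one action dict per document, in the shape elasticsearch.helpers.bulk expects.
def build_bulk_actions(qa_pairs, index_name="chat_bot_index"):
    actions = []
    for pair in qa_pairs:
        actions.append({
            "_index": index_name,
            "_source": {
                "question": pair["question"],
                "answer": pair["answer"],
                # placeholder vector; replace with sent_transformer.encode([...])[0].tolist()
                "embedding": [0.0] * 768,
            },
        })
    return actions

actions = build_bulk_actions([{"question": "q1", "answer": "a1"}])
print(len(actions))  # 1
# With a live cluster: from elasticsearch import helpers; helpers.bulk(es_client, actions)
```

One bulk request amortizes the HTTP round-trip across many documents, which matters once the FAQ list grows beyond a handful of entries.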

Step 6) Querying from Elasticsearch


# hyper-parameter weighting the embedding cosine-similarity score (tune as needed)
ENCODER_BOOST = 10

def query_question(question: str, top_n: int = 10) -> List[dict]:
    embedding = sent_transformer.encode([question])[0].tolist()
    es_result = es_client.search(
        index=INDEX_NAME,
        body={
            "from": 0,
            "size": top_n,
            "_source": ["question", "answer"],
            "query": {
                "script_score": {
                    "query": {
                        "match": {
                            "question": question
                        }
                    },
                    "script": {
                        "source": """
                            (cosineSimilarity(params.query_vector, "embedding") + 1)
                            * params.encoder_boost + _score
                        """,
                        "params": {
                            "query_vector": embedding,
                            "encoder_boost": ENCODER_BOOST,
                        },
                    },
                }
            },
        },
    )
    hits = es_result["hits"]["hits"]
    clean_result = []
    for hit in hits:
        clean_result.append({
            "question": hit["_source"]["question"],
            "answer": hit["_source"]["answer"],
            "score": hit["_score"],
        })
    return clean_result

query_question("How to make my English fluent?")

We can modify the ES query by including a “script” field, enabling us to define a scoring function that calculates the cosine similarity between embeddings and combines it with the overall BM25 matching score of Elasticsearch. To adjust the weight given to the embedding cosine similarity, we can modify the hyper-parameter “ENCODER_BOOST.”
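To see what the Painless script computes, the scoring formula can be replicated locally. A sketch with toy numbers, where `bm25_score` stands in for Elasticsearch's `_score`:

```python
import math

def hybrid_score(query_vec, doc_vec, bm25_score, encoder_boost):
    """(cosineSimilarity + 1) * encoder_boost + BM25 score, mirroring the ES script."""
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    cos = dot / (math.sqrt(sum(q * q for q in query_vec)) *
                 math.sqrt(sum(d * d for d in doc_vec)))
    return (cos + 1) * encoder_boost + bm25_score

# Identical vectors -> cosine 1, so score = 2 * boost + bm25
print(hybrid_score([3.0, 4.0], [3.0, 4.0], bm25_score=3.0, encoder_boost=10))  # 23.0
```

The `+ 1` shifts the cosine range from [-1, 1] to [0, 2] so the script never produces a negative score, and the boost controls how much semantic similarity outweighs keyword matching.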


Conclusion

In this article, we explored the application of SBERT and Elasticsearch in creating a chatbot. We discussed building a chatbot that answers queries based on predefined question-answer pairs while considering the query’s intent.

Here are the key takeaways from our exploration:

  1. Grasping the significance of SBERT and Elasticsearch in the realm of chatbot development, harnessing their capabilities to enhance conversational experiences.
  2. Utilizing SBERT to generate embeddings for the questions enables a deeper understanding of their semantics and context.
  3. Leveraging Elasticsearch to establish an index that efficiently stores and organizes the question-answer pairs, optimizing search and retrieval operations.
  4. Demonstrating the query process in Elasticsearch, illustrating how the chatbot effectively retrieves the most relevant answers based on the user’s question.

Frequently Asked Questions

Q1. How is SBERT different from BERT?

A. SBERT extends BERT to encode sentence-level semantics, whereas BERT focuses on word-level representations. SBERT considers the entire sentence as a single input sequence, generating embeddings that capture the meaning of the entire sentence.

Q2. What can SBERT be used for?

A. SBERT can be used for various natural language processing tasks such as semantic search, sentence similarity, clustering, information retrieval, and text classification. It enables comparing and analyzing the semantic similarity between sentences.

Q3. Can SBERT handle long documents?

A. SBERT is primarily designed for sentence-level embeddings. However, it can also handle short paragraphs or snippets of text. For longer documents, extracting sentence-level representations and aggregating them using techniques like averaging or pooling is common.
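The pooling alternative mentioned above can be sketched as an element-wise max over per-sentence vectors; the toy 2-dimensional vectors below stand in for real 768-dimensional SBERT embeddings:

```python
def max_pool_doc(sentence_embeddings):
    """Element-wise max over per-sentence vectors: one pooling alternative to averaging."""
    dim = len(sentence_embeddings[0])
    return [max(vec[i] for vec in sentence_embeddings) for i in range(dim)]

# Three "sentence" embeddings aggregated into one document vector
doc_vec = max_pool_doc([[1.0, 3.0], [2.0, 1.0], [3.0, 2.0]])
print(doc_vec)  # [3.0, 3.0]
```

Max pooling keeps the strongest signal in each dimension across sentences, whereas averaging smooths them; which works better depends on the downstream task.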

Q4. How does Elasticsearch work?

A. Elasticsearch operates as a distributed system, with data being divided into multiple shards that can be spread across different nodes in a cluster. Each shard contains a subset of the data and is fully functional, allowing for efficient parallel processing and high availability. When a search query is executed, Elasticsearch uses a distributed search coordination mechanism to route the query to the relevant shards, perform the search operations in parallel, and merge the results before returning them to the user.
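The scatter-gather step can be sketched as merging per-shard top-k lists by score. This is toy data; real Elasticsearch performs this coordination internally:

```python
def merge_shard_results(shard_hits, top_n):
    """Merge per-shard hit lists and keep the top_n hits by descending score."""
    all_hits = [hit for shard in shard_hits for hit in shard]
    return sorted(all_hits, key=lambda h: h["score"], reverse=True)[:top_n]

# Each shard returns its own locally ranked hits
shard_a = [{"doc": "q1", "score": 12.5}, {"doc": "q2", "score": 4.0}]
shard_b = [{"doc": "q3", "score": 9.1}]

print(merge_shard_results([shard_a, shard_b], top_n=2))
# [{'doc': 'q1', 'score': 12.5}, {'doc': 'q3', 'score': 9.1}]
```

Because each shard only needs to return its local top results, the merge stays cheap even when the index holds millions of documents.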

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion. 

