Yogesh Kulkarni — January 21, 2018

Introduction

What do we do when we need any information? Simple: “We Ask, and Google Tells”.

But if the answer depends on multiple variables, the existing Ask-Tell model tends to sputter. State-of-the-art search engines usually cannot handle such requests: we have to search for information scattered in bits and pieces, then filter and assemble the relevant parts ourselves. Sounds time-consuming, doesn’t it?

Source: Inbenta

This Ask-Tell model is evolving rapidly with the advent of chatbots (also referred to as just “bots”).

This article demonstrates how to build a chatbot that answers queries about the recently introduced Goods and Services Tax (GST) in India. GST is a relatively new concept, and a lot of people are still trying to understand how it works. Chatbots can provide such information in a natural and conversational way. Let’s call this one the GST-FAQ Bot!

 

Table of Contents

  1. Chatbots and NLP
  2. Why RASA-NLU?
  3. Building GST FAQ bot architecture
  4. Installation
  5. Server
    • Steps to build the server of the chatbot
  6. Client
  7. Engine
  8. Our chatbot in action

 

Chatbots and NLP

To explain GST (how to apply for registration, the tax slabs, and so on), companies have posted Frequently Asked Questions (FAQs) on their websites. Going through that amount of information can be a painstaking process. In these situations chatbots come in handy, and they have become enormously popular as a result.

These days, Natural Language Processing (NLP), especially its component Natural Language Understanding (NLU), has allowed bots to have a greater understanding of language and context. They are becoming more intelligent in understanding the meaning of the search and can return very specific, context-based information.

Applications like WhatsApp, Facebook Messenger, Slack, etc. are increasingly being used by businesses, and bots are starting to replace website interfaces as well. Of the considerable number of choices available for building a chatbot, this particular implementation uses the RASA-NLU library in Python.

 

Why RASA-NLU?

Many chatbot platforms are currently available, from rudimentary rule-based AIML (Artificial Intelligence Markup Language), to highly sophisticated AI bots. Some popular chatbot platforms are API.ai, Wit.ai, Facebook APIs, Microsoft LUIS, IBM Watson, etc.

RASA-NLU builds a local NLU (Natural Language Understanding) model for extracting intent and entities from a conversation. It’s open source, fully local and above all, free! It is also compatible with wit.ai, LUIS, or api.ai, so you can migrate your chat application data into the RASA-NLU model.

Below is a demonstration of how to install RASA-NLU and build a simple FAQ bot in Python.

 

Building GST FAQ bot architecture

A chatbot is a client-server application. In the case of RASA-NLU, even the server can be local. The client is nothing but the chatbot UI. The interaction and architecture can be understood from the following diagram:

 

Installation

RASA can be installed and configured on a standalone machine. Steps to follow:

  1. Install it with: pip install rasa_nlu
  2. You can view the latest documentation here

RASA-NLU is made up of a few components, each doing some specific work (intent detection, entity extraction, etc.). Each component may have some specific dependencies and installations. Options like MITIE (NLP + ML), Spacy and Sklearn are available to choose from. We will be using Spacy-Sklearn here.

The client UI can be a web page (using frameworks like Flask in Python) or a mobile app. Flask is simple to code and runs locally. Use pip install flask and follow this tutorial to get a basic understanding of the framework.

 

Server

A RASA-NLU model needs to be trained before we can use it. We supply it with a few sentences and mark the intent and the entities in each one. Intents are the actions/categories of the sentences, and entities are the variables needed to fulfil those actions.

For example, “I wish to book a flight from Mumbai to Pune on 27 March” has “flight-booking” as the intent and “Mumbai”, “Pune” and “27 March” as the entities. Similarly, many training examples can be supplied so that the RASA-NLU model learns the different ways intents and entities appear in our domain conversations. This training data is stored in a JSON file, a sample of which can be seen here.

It contains many entries. One of the sample entries is shown below:

      {
        "text": "show me a mexican place in the centre",
        "intent": "restaurant_search",
        "entities": [
          {
            "start": 31,
            "end": 37,
            "value": "centre",
            "entity": "location"
          },
          {
            "start": 10,
            "end": 17,
            "value": "mexican",
            "entity": "cuisine"
          }
        ]
      }

 

Following is the explanation of some of the fields mentioned in the above code:

  • text: the input sentence.
  • intent: action or category, in this instance “restaurant_search”. This is typically the call-back function name.
  • entities: array of entities. Here, there are two. One is of type ‘location’ with value as ‘centre’, whereas the other is of type ‘cuisine’ with value ‘mexican’. ‘start’ and ‘end’ specify beginning and ending indices of the word in the sentence.
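Computing the ‘start’ and ‘end’ offsets by hand is error-prone. As a minimal sketch (the helper name make_example is hypothetical and not part of RASA-NLU), the offsets can be derived with str.find, with ‘end’ treated as exclusive. The wrapper keys rasa_nlu_data and common_examples are assumed from the legacy RASA-NLU JSON training format, so check the sample file or the documentation for your version.

```python
import json


def make_example(text, intent, entities):
    """Build one training entry; `entities` maps entity type -> substring of `text`."""
    entries = []
    for entity_type, value in entities.items():
        start = text.find(value)  # index of the first occurrence
        if start == -1:
            raise ValueError(f"{value!r} does not occur in {text!r}")
        entries.append({
            "start": start,
            "end": start + len(value),  # exclusive end index
            "value": value,
            "entity": entity_type,
        })
    return {"text": text, "intent": intent, "entities": entries}


example = make_example(
    "show me a mexican place in the centre",
    "restaurant_search",
    {"location": "centre", "cuisine": "mexican"},
)

# Assumed wrapper layout of the legacy RASA-NLU JSON training file.
training_data = {"rasa_nlu_data": {"common_examples": [example]}}
print(json.dumps(training_data, indent=2))
```

Note that str.find returns only the first occurrence, so a sentence that repeats the same word would need the offsets checked manually.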

You can use the below online tool as well, to generate this json file:

https://rasahq.github.io/rasa-nlu-trainer/

 

Steps to build server side of the GST chat bot application:

  1. Create a new directory and navigate to it.
  2. Create the following things in it:
    1. data directory
    2. data/gstfaq-data.json: This has GST FAQ training examples as shown above
    3. config.json: This has settings for RASA-NLU as shown below:
      {
        "pipeline": "spacy_sklearn",
        "path" : "./",
        "data" : "./data/gstfaq-data.json"
      }
  3. Train the model in Python:
    • python -m rasa_nlu.train -c config.json
  4. This will create a model_YYYYMMDD-HHMMSS folder
  5. Run:
    • python -m rasa_nlu.server -c config.json --server_model_dirs=./model_YYYYMMDD-HHMMSS
    • This starts the RASA-NLU server and lets us know the port it is listening on (5000 in the examples that follow)
    • Newer rasa_nlu releases may reject --server_model_dirs (see the comments); in that case, run: python -m rasa_nlu.server -c config.json -e luis
  6. Once the server is running, you can send GET/POST requests to it with curl, or use it as an HTTP server
  7. For the HTTP server, add “-e luis” while running the server, so that responses follow the LUIS format that the engine code expects
  8. Then, in the browser, type: http://localhost:5000/parse?q=hello%20there
  9. The output is shown below:
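The original screenshot is not reproduced here. As a rough sketch, assuming a LUIS-style reply with a topScoringIntent object (which the engine code further down reads) and a top-level entities list (an assumption about the LUIS layout), the request URL can be built and a reply unpacked as follows. The sample payload is illustrative only, not captured server output, and the helper names are hypothetical.

```python
from urllib.parse import quote, urlencode

PARSE_ENDPOINT = "http://localhost:5000/parse"  # port taken from the steps above


def build_parse_url(text):
    """Return the GET URL that asks the NLU server to parse `text`."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return PARSE_ENDPOINT + "?" + urlencode({"q": text}, quote_via=quote)


def unpack_reply(payload):
    """Pull the intent name and the raw entity list out of a LUIS-style reply."""
    intent = payload.get("topScoringIntent", {}).get("intent")
    entities = payload.get("entities", [])
    return intent, entities


# Illustrative payload only; real scores and fields depend on the trained model.
sample_reply = {
    "query": "hello there",
    "topScoringIntent": {"intent": "greet", "score": 0.9},
    "entities": [],
}

print(build_parse_url("hello there"))
print(unpack_reply(sample_reply))
```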

We can now look at the remaining components of our GST FAQ bot.

 

Client

Our chatbot client UI has been built using the Flask framework. It uses two HTML templates to render the UI (i.e. the chat window). The outer UI is built with base.html as shown below:

<!DOCTYPE HTML>
<html lang="en">
<head>
…
</head>
<body>
    <h1 align="center">GST FAQ Chat</h1>
    <div class="container">
        {% block content %}{% endblock %}
    </div>
    <footer>
        {% block other_footers %}{% endblock %}
    </footer>
</body>
</html>

The content and other_footers blocks are defined in home.html as shown below:

{% extends "base.html" %}

{% block content %}
<div class="row">
    <div class="col-sm-6">
        <div class="row">
            <div class="chat_window">
                <ul class="messages"></ul>
                <div class="bottom_wrapper clearfix">
                    <div class="message_input_wrapper">
                        <input id="msg_input" class="message_input" placeholder="Say Hi to begin chat..." />
                    </div>
                    <div class="send_message">
                         <div class="text">send</div>
                    </div>
                </div>
            </div>
            <div class="message_template">
                <li class="message">
                    <div class="avatar"></div>
                    <div class="text_wrapper">
                        <div class="text"></div>
                    </div>
                </li>
            </div>
        </div>
    </div>
</div>
{% endblock %}


{% block other_footers %}
<script src="{{ url_for('static', filename='js/bind.js') }}"></script>
{% endblock %}

 

Engine

This is the heart of the chatbot. Based on the intent received from RASA-NLU, it dispatches the entities to the mapped call-back function. That function, in turn, queries the Knowledgebase with the entities to get the response. Once the response is received, it is sent back to the UI.

Knowledgebase can be as simple as a dictionary of questions and answers, or as sophisticated as one can imagine/require (like databases, internet sources, etc.). This article, being minimalistic for demonstration purposes, fetches pre-coded responses from the dictionary.

Let’s take a look at the sample dictionary:

intent_response_dict = {
    "intro": ["This is a GST FAQ bot. One stop-shop to all your GST related queries"],
    "greet":["hey","hello","hi"],
    "goodbye":["bye","It was nice talking to you","see you","ttyl"],
    "affirm":["cool","I know you would like it"],
    "faq_link":['You can check all the events <a href="https://www.cbec.gov.in/resources//htdocs-cbec/deptt_offcr/faq-on-gst.pdf">here</a>']
}
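The engine code in the next listing falls back to a get_random_response helper that the article does not show. A minimal sketch, assuming it simply picks one canned reply for the intent at random and uses a default message for unknown intents (the default text and the trimmed dictionary here are illustrative):

```python
import random

# Trimmed copy of the dictionary above, so that the sketch runs on its own.
intent_response_dict = {
    "greet": ["hey", "hello", "hi"],
    "goodbye": ["bye", "It was nice talking to you", "see you", "ttyl"],
}

DEFAULT_RESPONSE = "Sorry I am not trained to do that yet..."  # assumed fallback text


def get_random_response(intent):
    """Return a random canned reply for `intent`, or the fallback if there is none."""
    responses = intent_response_dict.get(intent)
    if not responses:
        return DEFAULT_RESPONSE
    return random.choice(responses)


print(get_random_response("greet"))
```

Keeping several replies per intent and choosing among them at random makes the bot feel a little less repetitive without any extra training.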

The engine’s use of RASA-NLU for intent-entities extraction and dispatching call-backs can be seen below:

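import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

# gst_info, gst_query and get_random_response are the engine's call-back
# helpers, defined elsewhere in the app.

@app.route('/chat', methods=["POST"])
def chat():
    try:
        user_message = request.form["text"]
        # Ask the RASA-NLU server (started with "-e luis") to parse the message.
        response = requests.get("http://localhost:5000/parse", params={"q": user_message})
        response = response.json()
        # In the LUIS format, entities sit at the top level, next to topScoringIntent.
        entities = response.get("entities")
        intent = response["topScoringIntent"].get("intent")
        if intent == "gst-info":
            response_text = gst_info(entities)
        elif intent == "gst-query":
            response_text = gst_query(entities)
        else:
            response_text = get_random_response(intent)
        return jsonify({"status": "success", "response": response_text})
    except Exception as e:
        print(e)
        return jsonify({"status": "success", "response": "Sorry I am not trained to do that yet..."})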


User text is sent to the RASA-NLU server using http://localhost:5000/parse. Its response contains the intent and the entities. Depending on the intent (gst-info or gst-query), the matching function (gst_info or gst_query) is called. Its response is then sent back to the UI.
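As the number of intents grows, the if/elif chain can be replaced by a lookup table from intent name to call-back. The sketch below is one possible arrangement, not the app's actual code: the handler body, the single knowledge-base entry, and the assumption that each entity is a dictionary whose "entity" key holds the matched text are placeholders for illustration.

```python
# Hypothetical knowledge base: entity text -> answer.
KNOWLEDGE_BASE = {
    "gst": "GST stands for Goods and Services Tax.",
}

FALLBACK_TEXT = "Sorry I am not trained to do that yet..."


def entity_texts(entities):
    """Collect the matched text of each entity (assumes an "entity" key per item)."""
    return [str(e.get("entity", "")).lower() for e in entities or []]


def gst_info(entities):
    """Answer a definition-style question from the knowledge base."""
    for text in entity_texts(entities):
        if text in KNOWLEDGE_BASE:
            return KNOWLEDGE_BASE[text]
    return "Could not find that topic. Please rephrase."


def fallback(entities):
    """Reply used when no call-back is registered for the intent."""
    return FALLBACK_TEXT


# Lookup table from intent name to call-back function.
HANDLERS = {"gst-info": gst_info}


def dispatch(intent, entities):
    """Route the entities to the call-back registered for `intent`."""
    return HANDLERS.get(intent, fallback)(entities)


print(dispatch("gst-info", [{"entity": "GST", "type": "topic"}]))
```

Adding a new intent then means writing one function and registering it in HANDLERS, without touching the route itself.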

The source code for this app is available on github.

 

Our chatbot in action

Steps to operate:

  1. Start the RASA-NLU server by executing the “run_server.bat” script. It loads the custom trained model and starts listening on port 5000.
  2. Run the Flask app and open localhost in the browser at the given port, say 8000.
  3. Start typing messages in the chat window at the bottom and click “Send”.
  4. Typed messages and their responses appear in the window above, as seen in the adjoining picture.
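The chat window is not the only way to exercise the engine: the /chat route reads a form field named text, so it can also be called from a script. A minimal sketch using only the standard library, assuming the Flask app listens on port 8000 as in the steps above. The helper names are hypothetical, and the request is only built here; it is sent only if you call send() with both servers running.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

CHAT_URL = "http://localhost:8000/chat"  # assumed port, as in the steps above


def build_chat_request(message):
    """Build a form-encoded POST carrying the `text` field that /chat reads."""
    body = urlencode({"text": message}).encode("utf-8")
    return Request(CHAT_URL, data=body, method="POST")


def send(message):
    """Send the message and return the bot's reply text (needs both servers running)."""
    with urlopen(build_chat_request(message)) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]


chat_request = build_chat_request("What is GST?")
print(chat_request.get_method(), chat_request.full_url, chat_request.data)
```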

 

 

You can view the video demonstration here.

 

End Notes

This tutorial presents just a small example, demonstrating the potential to develop something full-fledged and practically useful. Our GST Q&A bot can be enhanced on various fronts, such as expansion of knowledgebase (i.e. number of questions and answers), better training to find more intents and entities, Natural Language Generation of the responses to have a human language feel, etc.

GST FAQ Bot is just one example of building an intuitive frontend using government information. With the availability of more APIs and open public data, we can build similar (if not better) bots for those databases. Imagine interacting with government departments using a chatty bot!

 


About the Author

Yogesh Kulkarni

Yogesh H. Kulkarni is currently pursuing a full-time PhD in the field of Geometric Modeling, after working in the same domain for more than 16 years. He is also keenly interested in data science, especially Natural Language Processing and Machine Learning, and wishes to pursue a further career in these fields.


45 thoughts on "Building a FAQ Chatbot in Python – The Future of Information Searching"

Hunaidkhan says: January 22, 2018 at 12:37 pm
Superb and very informative article Yogesh.
neel says: January 22, 2018 at 5:23 pm
Great intro on chatbot for a novice like me! It would be great if you could give more insights on knowledge base. As here simple knowledge base has been used but if a smart chatbot need to be developed a good knowledge base would be required. Help me to explore and integrate better knowledge base. Do knowledge base need to be created manually or we can also come over it using some hack (considering a domain specific chatbot)? Let's say for an e-commerce chatbot, what could be the probable knowledge base we can use? Kindly share your email, it would be really nice to connect with you.
Debdipta Halder says: January 23, 2018 at 2:36 pm
Hi, I do have a small question. I want to build a chatbot for not only FAQ but also for other conversations. It is a company specific chatbot. Currently I am planning on using tensorflow to achieve the goal using seq2seq algorithm for deep learning. My question is can we build a general chatbot which not only works with FAQs but also with normal conversations using RASA-NLU?
Yogesh Kulkarni says: January 23, 2018 at 4:37 pm
Knowledge base can get as complex as one wants or needs. NLU layer just interprets what user types into "intent" and "entities", rest is all up to queries into the knowledge-base.
Yogesh Kulkarni says: January 23, 2018 at 4:42 pm
NLU layer just interprets what user types into "intent" and "entities". Rest is all up to you. If you have a seq2seq model (rnn/lstm) that can predict the response (decoder), given intent and entities (encoder), then that should work as well. Btw, would like to suggest you to look at RASA's latest core module. It models the next action... may be useful in what you are trying to do.
monk_programmer says: January 23, 2018 at 4:47 pm
I have built the thing you are describing: it gives FAQ answers but it also chats like a normal chatbot, both functionalities using some deep learning and core NLP things. You can ping me at "monkprogrammer{}gmail{}com".format('@','.') if you want to know about it.
Sunil M says: January 23, 2018 at 11:36 pm
Dear Yogesh, Thanks for the article. When I executed the code, it created model_YYYYMMDD-HHMMSS folder inside another new folder DEFAULT. When I try to start the server as below
    -m rasa_nlu.server -c config.json --server_model_dirs=./default/model_YYYYMMDD-HHMMSS
getting following error
    server.py: error: unrecognized arguments: --server_model_dirs=./default/model_20180123-115527
Your inputs on the error will be helpful.
Ganesh Jha says: January 27, 2018 at 6:47 am
Nice Article
Reshma says: January 29, 2018 at 2:39 pm
Even I am getting same error. Did you find some solution for this?
Yogesh Kulkarni says: January 29, 2018 at 7:24 pm
Please re-sync with the latest github code. Delete the existing model. Re-train by running the training script, it will create a new model. Run the latest/updated run_server script. Please let me know if you are still getting the error, after following the above steps.
Shankar says: January 31, 2018 at 9:38 pm
The error still exists. I have taken the latest code from git today. I checked in the code for server.py and there isn't an argument called "server_model_dirs". It just has the following:
    usage: server.py [-h] [-c CONFIG] [-e {wit,luis,dialogflow}] [-l {de,en}] [-m MITIE_FILE] [-p PATH] [--pipeline PIPELINE] [-P PORT] [-t TOKEN] [-w WRITE]
Yogesh Kulkarni says: February 01, 2018 at 6:24 am
First delete the existing model. Create a new one by training using run_training.bat:
    python -m rasa_nlu.train -c config.json
Config.json should be:
    {
      "pipeline": "spacy_sklearn",
      "path" : "models/",
      "data" : "data/gstfaq-data.json",
      "num_threads":10
    }
Then run run_server.bat, it should be:
    python -m rasa_nlu.server -c config.json -e luis
Shankar says: February 01, 2018 at 1:18 pm
Thanks Yogesh, I have tried all this and it's NOT throwing error now. I am using mac and there were issues linking with SPACY's en dictionary too (I had to download using python -m spacy download en). Also there are some environment variable issues when using spacy so we need to set these 2 variables to initiate utf-8 locale:
    export LC_ALL=en_US.UTF-8
    export LANG=en_US.UTF-8
For me the FLASK app isn't opening at all. The webpage which opens (http://127.0.0.1:5000/) is just static and has this written: 'hello from Rasa NLU: 0.10.6'. I used 5000 port as I got that from the output on terminal (from python -m rasa_nlu.server -c config.json -e luis):
    INFO:__main__:Started http server on port 5000
    2018-02-01 13:11:08+0530 [-] Log opened.
    2018-02-01 13:11:08+0530 [-] Site starting on 5000
    2018-02-01 13:11:08+0530 [-] Starting factory
Can you tell me if there is something wrong which I am doing or some package which is not behaving as it should be?
Malar says: February 08, 2018 at 4:41 am
Nice article for a beginner. Can you please let us know how to integrate the bot to facebook messenger and alexa?
Yogesh Kulkarni says: February 08, 2018 at 9:06 am
Hard to tell just by this info. But I would suggest asking this on Rasa forum. Those folks are generally quick and helpful.
Yogesh Kulkarni says: February 08, 2018 at 9:09 am
I won't be able to tell as I have not done those integrations myself.
Tarun Mishra says: March 06, 2018 at 1:07 pm
Very useful blog. I got some relevant information on how we can build a FAQ chatbot in Python.
Arunvinodh says: March 09, 2018 at 12:34 pm
Is there any website that uses a ChatBot developed in RASA NLU? Kindly reply with the link. Need to check the performance.
SELVA says: March 13, 2018 at 4:26 pm
This article is a very amazing one. I have built a bot in the above mentioned way. I think this handles context, but I don't know how to handle context? Please explain context handling with any example.
selvaganapathi says: March 13, 2018 at 4:32 pm
Yogesh Kulkarni, this article is a nice one. Can you explain context handling in the above method?
Yogesh Kulkarni says: March 13, 2018 at 6:02 pm
Rasa NLU is trained to identify intent and entities. Better the training, better the identification. Context gets captured better. As far as Slot-filling is concerned, that one has to do by our own logic. Rasa Core is now providing dialog manager to let you navigate the dialog states via stories. I would suggest going to tutorials at Rasa site.
Payas Pandey says: March 18, 2018 at 6:49 pm
Great article. Is it a good idea to use "If-else" for response generation or "Rasa's Core Stories"? If you can, please state the pros and cons of both the approaches.
Sherice Ross says: March 21, 2018 at 12:17 am
Hi, How do I train the model in python? Do I simply type "-m rasa_nlu.train -c config.json" in the command terminal?
Sherice Ross says: March 21, 2018 at 12:32 am
Hi, I typed the commands below into my command terminal and I keep getting an error. Why is that?
    C:\new directory\json>python
    Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> -m rasa_nlu.train -c config.json
      File "<stdin>", line 1
        -m rasa_nlu.train -c config.json
        ^
    SyntaxError: invalid syntax
Yogesh Kulkarni says: March 21, 2018 at 7:07 am
    >>> python -m rasa_nlu.train -c config.json
Can you run the batch files? Also, is your rasa installation proper?
Parag Kulkarni says: March 22, 2018 at 8:42 pm
Hi, what are the prerequisites to learn this? I mean level of Python required...
Yogesh Kulkarni says: March 23, 2018 at 7:04 am
Basic python and web framework such as flask will do.
Sherice Ross says: March 23, 2018 at 12:46 pm
Hi, How do I train the model in python? Do I simply type “-m rasa_nlu.train -c config.json” in the command terminal?
Madhavi says: March 25, 2018 at 5:57 pm
GST info and GST query entity extraction is not working properly. I am running the code from GitHub. Only intro, greet and affirm are working. Can you tell me if I am missing something? Thanks
Justin says: March 25, 2018 at 8:12 pm
Hi Yogesh, This article was very informative and gave a headstart for me. I tried this out and it worked! Have just one question. How do we go about it, if the bot wants to give an option or ask one question to the user, before answering the user's original question? TA
Yogesh Kulkarni says: March 26, 2018 at 7:30 am
This GSTFaq chatbot is a basic one. One can add much more sophistication on top of it, like the one you mentioned. I would suggest going through Rasa-NLU and Rasa-Core tutorials on their site.
Yogesh Kulkarni says: March 26, 2018 at 7:33 am
Can you debug to see if Rasa is returning the proper intent? After that, checking the further query should be straightforward.
AD says: April 07, 2018 at 3:15 pm
Thank you Yogesh! I configured Rasa and Flask also. I trained using "python -m rasa_nlu.train -c config.json --data=./models/default/", as it was throwing an error. Now when I start the chatbot (Flask) it shows the following error. If you could throw some light on the same: "Sorry I am not trained to do that yet..."
Yogesh Kulkarni says: April 08, 2018 at 6:16 am
Need to make sure that the models are created/present in the path ./models/default/. I would suggest going through tutorials at Rasa-NLU first: http://rasa-nlu.readthedocs.io/en/latest/tutorial.html
Ramprasad says: April 21, 2018 at 6:37 pm
Thank you. Please post chatbot model to train and use using tensorflow in python.
Ben Gung says: April 30, 2018 at 11:47 pm
Hi Yogesh, Nice article! But I cannot access the tutorial on this site: http://rasa-nlu.readthedocs.io/en/latest/tutorial.html Permission denied. I am trying to read the tutorial from the US. How do I gain access to your site? Thank you! Ben
Yogesh Kulkarni says: May 01, 2018 at 4:46 am
Rasa NLU may not be hosting their tutorials there anymore. Googling for the same, I found https://nlu.rasa.com/tutorial.html
Sunil Kumawat says: May 19, 2018 at 12:03 pm
Hi Yogesh, I am working on a project of making a FAQ chatbot for a small finance bank. I have basic knowledge of python. Please suggest the procedure to make one by indicating what I need to learn. Thank you.
Yogesh Kulkarni says: May 19, 2018 at 5:00 pm
The article itself is a step by step procedure to make a chatbot.
MikeA says: June 01, 2018 at 8:37 pm
Just a guess but I think the training command is rather
    $ python -m rasa_nlu.train -d data/ -c config.json
which provides the path for the data to train the model. Therefore, what is the argument for starting the RASA-NLU server and let’s us know the port, for instance? I tried
    python -m rasa_nlu.server --path -d data/ -c config.json --server_model_dirs=./models/default/model_YYYY????-????/
MikeA says: June 01, 2018 at 8:42 pm
I think it lacks "python" at the beginning. I don't know why he didn't add it.
MikeA says: June 01, 2018 at 8:55 pm
I have the same error. Why was "-m rasa_nlu.server -c config.json --server_model_dirs=./default/model_YYYYMMDD-HHMMSS" expecting an argument then? Don't you have to add python at the beginning? Am I the only one who thinks that the "python -m rasa_nlu.train -c config.json" command lacks the path to the data necessary to train it?
Anonymous says: June 07, 2018 at 5:07 pm
Hi, For me while training, the new model folder is not getting created. The message is like this:
    /home/1099187/anaconda3/lib/python3.6/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
      'precision', 'predicted', average, warn_for)
    [Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 0.2s finished
TT says: August 08, 2018 at 3:17 pm
Hey, were you able to solve the error? Please tell how to solve it.
