DataHack Radio #24: Exploring and Designing Chatbots with RASA’s Justina Petraitytė
Chatbots are one of the most common applications of Natural Language Processing (NLP). Organizations are scrambling to integrate chatbots into their daily functions to enhance and personalize our experience. As a data science professional, I’m always curious about how these chatbots are built.
Rasa is one such open source framework that we can leverage to build our own chatbots. So we are delighted to have Rasa’s data scientist and Head of Developer Relations, Justina Petraitytė, on our DataHack Radio podcast!
Justina brings a diverse set of skills and expertise in the NLP domain. In this episode, Kunal and Justina discuss a broad range of topics, including:
- Justina’s background and first brush with NLP
- How data science works in the video gaming industry
- What Rasa is and how it works
- The latest NLP breakthroughs and future trends, among other things.
Justina’s Background, First Role in Data Science & Start with Rasa NLP
“I’m a data scientist at heart.” – Justina Petraitytė
That quote by Justina essentially encapsulates her career in data science and NLP to date. She completed her Bachelor’s in Econometrics in Lithuania (her home country). This four-year stint gave her an introduction to data analytics, time series forecasting, and the other core components of data science.
Justina was exposed to the data world quite early on in her career (as early as her undergraduate days). She worked in digital banking and fintech as an intern before her first role as a Senior Data Analyst at a UK-based video game company. If that sounds like a dream job, you aren’t alone!
“Working at a video game company was like a data scientist’s dream come true because you have a really, really interesting dataset to work with.”
She worked on a wide spectrum of machine learning and data analytics projects there. Take a moment to think about what you would do as a data scientist working with video game data. Now compare that with what Justina worked on:
- Clustering the players based on their behavioral profiles
- Building toxic language classifiers
- Designing recommendation engines, and so on.
I’m sure there’s a lot of overlap with what you imagined. Honestly, the incredible far-reaching impact and use of machine learning never ceases to amaze me.
This is where Justina’s journey with conversational AI (chatbots) started. She built a bot, deployed on Slack, to automate the analytical reports she used to prepare for the different departments in her company. In doing so, she improved her knowledge of machine learning and made her company more data-driven. Quite inspiring!
This, as you might have pieced together already, is where Justina came across Rasa. She started delving deeper into Rasa, wrote tutorials, attended meetups and continuously worked to improve her already blossoming skillset.
That was how she ended up in her current role at Rasa itself. Take note, all you budding NLP folks.
The Fascinating Features of Data Science and NLP in Gaming
What were some of the challenging problems Justina worked on in her gaming role? It’s a question all of us were curious about:
- Player retention: Figuring out how to keep users engaged by looking at various metrics within the game. This includes predicting when users will come back to the game. A really interesting project!
- Clustering players based on their behavioral profiles: How do the players play the game? What decisions do they make in different situations? This helps the gaming company improve the game. For instance, which players prefer 1v1 games? Which ones go for multiplayer games? And so on.
- Identifying toxic comments: Gamers will understand why certain words and comments need to be edited out or removed. This is where the NLP aspect comes in (jokes, sarcasm, sentiment, etc.)
For those curious about it – Justina worked mostly with supervised machine learning techniques to build her models.
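To make the supervised approach concrete, here is a minimal sketch of a toxic comment classifier. It is not Justina’s actual model: the technique (multinomial Naive Bayes with Laplace smoothing), the six training comments, and every function name are assumptions chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled comments (invented for this sketch). A real project
# would train on a large set of moderated in-game messages.
TRAINING_DATA = [
    ("you are a terrible player uninstall the game", "toxic"),
    ("shut up you useless noob", "toxic"),
    ("nobody wants you on this team idiot", "toxic"),
    ("great game everyone well played", "clean"),
    ("nice shot thanks for the help", "clean"),
    ("good luck have fun team", "clean"),
]


def tokenize(text):
    """Lowercase the comment and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Count words per label; these counts are the whole Naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def predict(text, model):
    """Return the label with the highest log-probability for the comment."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Laplace (add-one) smoothing keeps unseen words from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_DATA)
print(predict("you useless idiot", model))
print(predict("well played team", model))
```

With a toy dataset like this the model only picks up on individual words. Sarcasm and jokes, which Justina mentions as part of the challenge, need far more data and richer features.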
Introduction to Rasa
The Rasa Stack is a set of open-source NLP tools focused primarily on chatbots. It’s one of the most effective and time-efficient tools for building chatbots, and a basic bot can be up and running in minutes. We love using Rasa at Analytics Vidhya.
Justina, as we saw above, started working with Rasa because it was open source and she was looking for a tool to begin learning about chatbots. She was quickly won over by the incredible flexibility Rasa offered along with a very supportive user community.
“I was initially sceptical if Rasa would work as well as Google DialogFlow or wit.ai but when I tried it, it did work as well if not better than these tools!”
Interesting note here – we can use Rasa to build chatbots in regional languages as well. It isn’t just limited to English like the majority of chatbot platforms out there. A very welcome feature indeed!
Most approaches rely on pre-trained word vectors to build chatbots. Relying only on those word embeddings restricts the chatbot to the languages the vectors were trained on, which is often just English. Rasa tackles this challenge with its own model (called the TensorFlow embedding model), which learns its embeddings from the chatbot’s own training data.
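One way to avoid depending on pre-trained English vectors is to build the text representation from the bot’s own training examples. The sketch below illustrates that idea with a plain bag-of-words intent matcher trained on a few Spanish phrases. It is not Rasa’s TensorFlow embedding model and does not use the Rasa API; the intent names, example sentences, and helper functions are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical training examples in Spanish. The vocabulary comes entirely
# from these sentences, so no pre-trained English word vectors are needed.
TRAINING_EXAMPLES = {
    "greet": ["hola", "buenos días", "hola buenas tardes"],
    "goodbye": ["adiós", "hasta luego", "nos vemos mañana"],
    "order_food": ["quiero pedir una pizza", "quiero pedir comida"],
}


def bag_of_words(text):
    """Represent a message as word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_intent_vectors(examples):
    """Sum the word counts of every example sentence for each intent."""
    vectors = {}
    for intent, sentences in examples.items():
        total = Counter()
        for sentence in sentences:
            total.update(bag_of_words(sentence))
        vectors[intent] = total
    return vectors


def classify(text, intent_vectors):
    """Return the most similar intent, or None if no known word matches."""
    message = bag_of_words(text)
    scores = {intent: cosine(message, vec) for intent, vec in intent_vectors.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


intent_vectors = build_intent_vectors(TRAINING_EXAMPLES)
print(classify("hola buenos días", intent_vectors))
print(classify("quiero una pizza", intent_vectors))
```

A learned embedding model goes further than word counts, since it can place related words and intents close together. The point is the same, though: whatever language the training examples are written in is the language the bot understands.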
We have a couple of intuitive tutorials which you can use to start your own Rasa journey:
- Learn how to Build and Deploy a Chatbot in Minutes using Rasa
- Building a FAQ Chatbot in Python – The Future of Information Searching
Keeping up with the Latest Trends and Developments in Machine Learning and NLP
Justina’s role as the head of developer relations at Rasa means she needs to be on top of the latest developments in machine learning. And that, of course, extends to the NLP field.
“My personal rule is to learn something new all the time. Whether that’s a new technique, or brushing up my programming skills.”
You can see why Justina is such a successful data scientist. Continuously learning new things and upskilling yourself is a primary quality you must have. Combine that with a never-ending quest for knowledge? You have the perfect combination!
The folks at Rasa follow the same approach. They hold regular meetings to discuss the latest developments in NLP (and the field as a whole) where they tackle things like which features to prioritize first, etc.
Interesting Applications of Chatbots You Might Not Have Thought Of
Chatbots are now ubiquitous. Businesses around the world are relying heavily on these conversational AI agents to propel them into today’s digital era. I’m sure most of you have come across these chatbots in your daily routines:
- While ordering food
- Getting your queries resolved via customer support
- Finding resolutions to your financial/insurance queries, and so on.
But there are other applications of chatbots that might have escaped our attention. Justina provided a few excellent examples of these:
- Healthcare: A developer used Rasa to match people who need therapy with the therapists who could help them
- Education: Developers are using Rasa to build a system for primary school kids to help them complete their tasks in a better manner. This system also enables parents to track their child’s performance
- Gaming: Yes, video games! Instead of raising a new ticket every time you encounter a glitch, game developers are aiming to resolve the issue within the game itself using a chatbot assistant
The Future of Chatbots
“In general, the perception of what an AI assistant is and what it should be able to do is changing.”
Justina believes the current go-to approach for building chatbots is quite limited. You can design a chatbot using a set of rules to make it converse in a natural and polite way, but there is essentially no “intelligence” in play. There’s plenty of scope to expand beyond that.
Making machines understand the context of a sentence is tricky. But we are starting to break down that barrier with breakthroughs like Google’s BERT, OpenAI’s GPT-2 and Transformer-XL.
Justina firmly believes we are soon going to see chatbots evolve to become more contextual and personalized. The fine folks at Rasa are already working on this. They want the chatbot to:
- Remember the previous conversation
- Understand the context of the situation
- Figure out what the user likes
- And then formulate a reply
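The four steps above can be sketched as a tiny conversation tracker. This is only an illustration under heavy assumptions: it is still rule-based, it is not how Rasa implements dialogue management, and the `ConversationState` class, the `reply` function, and the phrases it recognises are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationState:
    """Hypothetical per-user memory kept between messages."""
    history: list = field(default_factory=list)      # previous messages
    preferences: dict = field(default_factory=dict)  # what the user likes


def reply(state, message):
    """Update the state with the new message and formulate a reply."""
    text = message.lower()

    # Step 1: remember the previous conversation.
    state.history.append(message)

    # Step 3: figure out what the user likes.
    if text.startswith("i like "):
        favourite = text[len("i like "):].strip(" .!")
        state.preferences["favourite_food"] = favourite
        return f"Noted, you like {favourite}."

    # Steps 2 and 4: use the stored context to formulate a personalised reply.
    if "recommend" in text:
        favourite = state.preferences.get("favourite_food")
        if favourite:
            return f"Since you like {favourite}, how about ordering {favourite} tonight?"
        return "What kind of food do you like?"

    return "Tell me more."


state = ConversationState()
print(reply(state, "I like sushi."))
print(reply(state, "Can you recommend something?"))
```

The second reply only makes sense because the bot kept state from the first message. A bot that treats every message in isolation would have to ask the user about their preferences again.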
Another great addition to the DataHack Radio podcast series. I really liked the diverse set of skills Justina brought to this episode. She is clearly an NLP master and is an eloquent speaker. Her ability to break down seemingly complex topics into non-technical terms is something I really appreciated.
What was your favorite part of this podcast? And is there any area or guest you would love to hear from on DataHack Radio? Let me know in the comments section below and let’s get talking!