A Must-Read Introduction to Sequence Modelling (with use cases)

Tavish Srivastava 31 May, 2020


Introduction

Artificial Neural Networks (ANNs) were supposed to replicate the architecture of the human brain, yet until about a decade ago, the only feature ANNs shared with our brain was the nomenclature of their entities (for instance – neuron). These neural networks were almost useless, as they had very low predictive power and few practical applications.

But thanks to the rapid advancement in technology in the last decade, we have seen the gap being bridged to the extent that these ANN architectures have become extremely useful across industries.

In this article, we will look at the two main advances in the field of artificial neural networks that have made ANNs more like the human brain.

Two Main Advances in the Field of ANN

GPUs have immensely improved our computational power, which now lets us vastly increase the depth and breadth of our networks of neurons. However, we are still far from reaching the number of neurons our brain has.
ANNs can now process sequence data at both the input and output nodes. This is closer to how our brain works: it does not solve binary classification problems to understand complex ideas. We formulate “Thoughts” based on a sequence of information given to us, and then our brain expresses this “Thought” as an understandable sequence of words.

Can we introduce this concept of “Thought” in an ANN? The answer is yes, and we will explore more about the idea in this article.

Sequence models have garnered a lot of attention because most of the data in the current world is in the form of sequences – it can be a number sequence, image pixel sequence, a video frame sequence or an audio sequence.

Over the last 10 years, we have stored thousands of petabytes (more than 10^9 GB) of unstructured sequence data with no real purpose, as we had no way to extract information from such data formats. Luckily, we now have a new family of neural network architectures called sequence models that can turn this data dump into GOLD MINES.

The scope of this article is not to cover all the complex mathematics that goes on behind the scenes in sequence modelling, or to give you full working implementations of sequence models (I will park that for later articles), but to give you practical examples of sequence modelling implementations in the industry. These will enable you to identify business problems in your industry that might need this special tool.

To get a better understanding of what this article is about, below is a scenario which I want you to imagine. Put your analytical thinking hats on!

Thought Experiment

Walmrt has appointed you as the head of its new vertical – WalKiosk. The company wants you to lead the development of a self-service (human-less) store where a customer interacts only with Walmrt’s Kiosk, which is very similar to a vending machine. They want to install this Kiosk in various locations across the United States.

A key difference between this Kiosk and a normal vending machine is that the Kiosk’s display does not show a list of items, but simply an audio-enabled, Google-like search box. The customer can walk up to one of these Kiosks and say or type anything after the keyword, as in “OK Walmrt, xxxxxx”. Here is a sample interaction (try to evaluate whether a human could do a better job than this Kiosk):

Customer says – “OK Walmrt, I want the shoes which Leonardo DiCaprio wore in the 1st scene of the 1st movie he did with Nolan” in any possible spoken language.

The idea is for the Kiosk to do a quick search and, if it finds a convincing answer, reply in the same language as the customer’s query with something like – “Leonardo DiCaprio wore black Nike shoes of model xxxxx. Click the link on the kiosk to watch a video cut of the scene you asked me to look at. Great news – we currently have the exact same shoe in the size you are wearing, and its cost is $200. As you are a loyal customer of Walmrt, I have found a steal deal for you! The new price of the shoe, if you buy it immediately, is $150”.

If the customer says “I want to buy it”, the Kiosk dispenses the shoe once the customer makes the payment.

The Kiosk finally replies – “Thanks Mr. XYZ for shopping with us today. Please give your valuable feedback for us to improve our service further.” The customer writes or speaks feedback on this transaction and leaves.

This simple transaction, which would probably take a good chunk of your time in today’s world, will be resolved in less than 2 minutes (if everything works, that is).

Sounds futuristic? Here’s a spoiler – all the fancy next-gen skills you need to build into this Kiosk can be handled mainly by a single architecture – sequence modelling. Here is a small list of tasks the Kiosk needs to do:

1. Speech Recognition to understand what the customer is saying
2. Machine Language Translation from the source language to a known language (say English)
3. Named entity/Subject extraction to find the main subject of the customer’s query translated in step 2
4. Relation Classification to tag relationships between the various entities tagged in step 3
5. Path Query Answering (similar to Google search) on the entity-relationships found in steps 3 and 4, using a core knowledge graph
6. Speech Generation to generate answers for the customer with all the relevant information found in step 5
7. Chatbot skill to have conversational ability and engage with customers just like a human
8. Text Summarization of customer feedback to work on key challenges/pain points
9. Product Sales Forecasting to replenish stock

The skills required to create WalKiosk are not limited to these nine steps, but they are good enough to bring out the core idea. Each of these nine skills can be modeled by a single architecture – Sequence Modelling (but you already knew this).

You can imagine sequence modelling as a black box which stays almost the same; all you need to change is the input and target data for each of the nine skill sets. Leveraging the idea that the model architecture in each step is the same, we can take this a step further and create a single model that takes input in any language and completes the self-service, reporting and inventory management processes all together.
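
To make the black box idea concrete, here is a minimal sketch of the interface only. The class `ToySeq2Seq` and the example sentences are hypothetical, and the "model" is just a lookup table that memorises its training pairs, not a real sequence model. The point is that the same `fit`/`predict` code is reused for two of the nine skills, and only the input and target data change.

```python
from typing import Dict, List, Sequence, Tuple

Tokens = Tuple[str, ...]


class ToySeq2Seq:
    """Placeholder for a real encoder-decoder: it only memorises training pairs."""

    def __init__(self) -> None:
        self._memory: Dict[Tokens, Tokens] = {}

    def fit(self, inputs: Sequence[Sequence[str]],
            targets: Sequence[Sequence[str]]) -> "ToySeq2Seq":
        # Store each (input sequence, target sequence) pair.
        for source, target in zip(inputs, targets):
            self._memory[tuple(source)] = tuple(target)
        return self

    def predict(self, source: Sequence[str]) -> List[str]:
        # A real model would generalise to unseen inputs; this lookup cannot.
        return list(self._memory.get(tuple(source), ("<unknown>",)))


# Same class, different (input, target) data for two of the nine skills.
translator = ToySeq2Seq().fit(
    inputs=[["quiero", "zapatos"]],
    targets=[["i", "want", "shoes"]],
)
feedback = ["the", "kiosk", "was", "fast", "but", "the", "price", "was", "high"]
summarizer = ToySeq2Seq().fit(
    inputs=[feedback],
    targets=[["fast", "service", "high", "price"]],
)

print(translator.predict(["quiero", "zapatos"]))
print(summarizer.predict(feedback))
```

In a real system you would swap the lookup table for a trained encoder-decoder network, but the calling code around it could stay much the same.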

If this was not enough to make you Google all about sequence modelling, let’s look at a broader list of what sequence modelling is capable of.

Practical Applications of Sequence Modelling

To make sure we cover most of the possible applications of sequence modelling, we will categorize them based on the type of input and output sequences. Inputs and outputs can be one of the following: Scalar, Trend, Text, Image, Audio or Video. If each of these six can be both input and output, we have 6 × 6 = 36 categories in total. However, not every one of these pairs has been explored in depth yet.
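
The counting above can be checked with a few lines of Python. This is only a sketch: the six type names come from the paragraph above, while the mapping of pairs to use cases in `EXAMPLES` is our own reading of the applications mentioned in this article, not an official taxonomy.

```python
from itertools import product

# The six data types named in the article.
TYPES = ["Scalar", "Trend", "Text", "Image", "Audio", "Video"]

# Every (input type, output type) pair, including same-type pairs such as Text -> Text.
CATEGORIES = list(product(TYPES, repeat=2))
print(len(CATEGORIES))  # 36

# A few pairs matched to use cases mentioned in this article (our own mapping).
EXAMPLES = {
    ("Audio", "Text"): "speech recognition",
    ("Text", "Text"): "machine translation, text summarization, chatbots",
    ("Text", "Audio"): "speech generation",
    ("Image", "Text"): "image captioning",
    ("Scalar", "Text"): "text generation from a random seed",
}
for (source, target), use_case in EXAMPLES.items():
    assert (source, target) in CATEGORIES
    print(f"{source} -> {target}: {use_case}")
```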

Before moving to the list, pause for a moment and create your own list of applications (you can use our thought experiment as a reference).

Here goes the list:

Reading the table is fairly straightforward:

Type is the category of input/target
Elements is the number of elements in the input/target series
Use Cases are the possible applications in the category

We will review a few of these use cases in order to get a grasp of the superpowers that our sequence models possess.

First, let’s talk about the easiest of the lot – Sequence Generators

These generators generally take scalar inputs. The scalar input can be any random seed/number. Following are a few examples of generators:

Note that we can train our model on any specific type of data. For instance, if we train our text generator on a Harry Potter book, it is highly likely that we will get text which is full of imagination/magic with Harry Potter as the main character. If you are lucky, you might get a chapter that makes sense, and you can enjoy this privileged chapter that no one else has access to!

Another example – if you train the model on Jazz music, you can create new songs in the same genre using this model. Yet another example – if you train the model on images of animals, you might see what cross-breeds could look like.
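
To get a feel for how a generator turns a scalar seed into a sequence in the style of its training data, here is a toy sketch. Note that it uses a character-level Markov chain, not a neural network, and that the function names and the tiny corpus are made up for illustration. A real sequence generator would learn far longer-range structure, but the workflow is similar: train on a corpus, pass in a random seed, and read out new text.

```python
import random
from collections import defaultdict


def train_char_model(text, order=3):
    """Map each `order`-character context to the characters that followed it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return dict(model)


def generate(model, length, seed):
    """Generate `length` characters; the scalar `seed` fixes the random choices."""
    rng = random.Random(seed)
    contexts = sorted(model)  # sorted so that a given seed is reproducible
    out = rng.choice(contexts)
    order = len(out)
    while len(out) < length:
        followers = model.get(out[-order:])
        if not followers:
            # Dead end (context never seen in training): restart from a random context.
            out += rng.choice(contexts)
            continue
        out += rng.choice(followers)
    return out[:length]


# Hypothetical mini-corpus; swap in any text you want the generator to imitate.
corpus = ("the wizard raised his wand and the owl watched "
          "the wizard and the wand glowed ")
model = train_char_model(corpus)
print(generate(model, 60, seed=7))
```

Changing `seed` gives a different sequence, and changing `corpus` changes the style of the output, which is the behaviour described above for generators trained on books, music or images.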

Next, let’s talk about the favorites – Sequence to sequence NLP Models

Machine Language Translation has reached new heights and is now competing strongly with human translators. Today, you can find real-time translating machines which are based on the core concept of sequence to sequence models.

Text summarization is another important use case of sequence models. It can significantly reduce the effort of manually reading lengthy customer complaints, carrying out compliance-based call/chat monitoring, and reviewing customer feedback on products.

Chatbots are yet another important application and are now widely used in operations, call centers, chat centers and personal assistants like Siri/Google Home/Alexa.

Finally, we will talk about a few more sequence to sequence models that go beyond text

Speech recognition is currently the category that has attracted the most investment. It is extremely important in tools like personal AI assistants (Alexa, Google Home, etc.) and call center speech recording tools.

Currently we have billion-dollar companies whose sole competency is speech recognition. Speech recognition also uses sequence to sequence models extensively. Image captioning is one of the hottest research fields and has wide application in the social media industry. Subtitle generation has not reached the production stage yet, but is being actively researched.

End Notes

A lot of the data science talent today focuses its effort on solving problems that already exist. An equally important task, for any successful data scientist or analyst, is to identify and create new tasks that can be solved analytically. The latter is a very different exercise and does not need a lot of coding experience or mathematical background. All you need to know is what is possible and what is not, using a given tool.

Problem identification is a skill set that is a “must” for any senior analytics professional. I hope this introductory article on sequence learning gave you strong motivation to start searching for new problems in your industry that can be solved using this method.

If you have any ideas or suggestions regarding the topic, do let me know in the comments below!


Tavish Srivastava, co-founder and Chief Strategy Officer of Analytics Vidhya, is an IIT Madras graduate and a passionate data-science professional with 8+ years of diverse experience in markets including the US, India and Singapore, domains including Digital Acquisitions, Customer Servicing and Customer Management, and industry including Retail Banking, Credit Cards and Insurance. He is fascinated by the idea of artificial intelligence inspired by human intelligence and enjoys every discussion, theory or even movie related to this idea.


Responses From Readers

Ramprasad 21 Apr, 2018

Thank you. Please post simple chatbot model (train+use) implementation using tensorflow in python.


Aishwarya Singh 23 Apr, 2018

Hi Ramprasad, You can follow this link for TensorFlow's seq2seq model.

Srinath 28 Aug, 2018

Greetings! Thanks a ton for sharing the insights. I liked the idea of not reinventing the models when we already have solutions to most of the problems; it is a good point to start with when we are beginning the journey in Data Science. I am currently working on converting free text to a catalog, or bucketing it into categories. Is there a way that you can help with my use case? I would appreciate your help.