Hack Session: Intent Identification for Indic Languages

In the age of smart devices, accurately detecting the user's intent from a natural language utterance is one of the fundamental problems to be solved in order to truly move from clicks to conversations. For example, if a machine is to offer a seamless customer support experience in the telecom domain to Indian users, it needs to understand the granular differences between sentences such as “मुझे सारे इंटरनेट प्लान्स दिखाओ” (“Show me all the internet plans”), “मुझे मेरा इंटरनेट प्लान दिखाओ” (“Show me my internet plan”), “मेरा इंटरनेट प्लान चल नहीं रहा” (“My internet plan is not working”) and “मेरा इंटरनेट प्लान कब वैलिड है” (“When is my internet plan valid?”). This talk will primarily focus on solving this problem for low-resource languages.

Structure of the Hack Session

Introduction to the problem and motivation
Overview of multiple approaches and their respective pros and cons: simple ranking based on cosine similarity, short text similarity / textual entailment, multi-label classification, and using translation systems.
Introduction to the dataset for intent identification and a walkthrough of the different intents in the training set.
Build an intent identification system using public datasets and the training set for the different intents.
Analyse the system for granularities such as usage of colloquial words (“करो”, “कर दो”, “करनेका है”, all variants of “do”), tense changes (“मेरा रिचार्ज करो”, “Do my recharge”, versus “मेने रिचार्ज कर दिया”, “I have done the recharge”), presence of negations, fragmented or broken user inputs, presence of multiple user intents in one utterance, etc.
Challenges in the existing system and future scope of work.
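
As a rough illustration of the first approach listed above (simple ranking based on cosine similarity), here is a minimal sketch in Python. It is not the system built in the session. It assumes character trigram counts as the text representation and one exemplar utterance per intent; the intent labels and function names are hypothetical, and only the Hindi exemplars are taken from the abstract.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams of a whitespace-normalised string."""
    padded = " " + " ".join(text.split()) + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical intent labels; the exemplar utterances come from the abstract.
EXEMPLARS = {
    "show_all_plans": "मुझे सारे इंटरनेट प्लान्स दिखाओ",
    "show_my_plan": "मुझे मेरा इंटरनेट प्लान दिखाओ",
    "plan_not_working": "मेरा इंटरनेट प्लान चल नहीं रहा",
    "plan_validity": "मेरा इंटरनेट प्लान कब वैलिड है",
}


def rank_intents(utterance):
    """Return (intent, score) pairs sorted by descending cosine similarity."""
    query = char_ngrams(utterance)
    scores = [
        (intent, cosine(query, char_ngrams(exemplar)))
        for intent, exemplar in EXEMPLARS.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # A fragmented input: "internet is not working"
    for intent, score in rank_intents("इंटरनेट चल नहीं रहा"):
        print(f"{intent}: {score:.3f}")
```

For the fragmented input above, the ranking places plan_not_working first because most of its trigrams overlap with that exemplar. A surface-overlap baseline like this has no notion of negation or tense, so near-identical sentences with different intents can score almost the same, which is one reason the other listed approaches are worth comparing against it.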

Key Takeaways

Understanding of granular problems and challenges of intent identification
Different approaches to solve the problem
Exposure to available public datasets for Indic languages and their utility

Check out the video below to learn more about the session.


Krupal Modi