Hack Session: Intent Identification for Indic Languages

Nov 13, 2019


Auditorium 2

60 minutes

Natural Language Processing

In the age of smart devices, accurately detecting the intent of the user from a natural language utterance is one of the fundamental problems to be solved in order to truly move from clicks to conversations. For example, if a machine is to offer a seamless customer support experience in the telecom domain to Indian users, it needs to understand the granular differences between sentences such as “मुझे सारे इंटरनेट प्लान्स दिखाओ” (show me all internet plans), “मुझे मेरा इंटरनेट प्लान दिखाओ” (show me my internet plan), “मेरा इंटरनेट प्लान चल नहीं रहा” (my internet plan is not working), and “मेरा इंटरनेट प्लान कब वैलिड है” (when is my internet plan valid). This talk will primarily focus on solving this problem for low-resource languages.


Structure of the Hack Session

  • Introduction to the problem and motivation
  • Overview of multiple approaches and their respective pros and cons: simple ranking based on cosine similarity, short text similarity / textual entailment, multi-label classification, and using translation systems.
  • Introduction to the dataset for intent identification and a walkthrough of the different intents in the training set.
  • Build an intent identification system using public datasets and the training set for the different intents.
  • Analyse the system for granularities such as the usage of colloquial words (“करो”, “कर दो”, “करनेका है”, all variants of "do"), tense changes (“मेरा रिचार्ज करो”, "do my recharge", versus “मेने रिचार्ज कर दिया”, "I have done the recharge"), presence of negations, fragmented or broken user inputs, presence of multiple user intents in one utterance, etc.
  • Challenges in the existing system and future scope of work.
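
The first approach listed above, simple ranking based on cosine similarity, can be illustrated with a minimal sketch. This is not the system built in the session: the intent labels, the choice of character trigram counts as features, and the function names (`char_ngrams`, `cosine`, `rank_intents`) are all assumptions made for illustration. The example utterances are the ones quoted in the abstract.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after normalising whitespace."""
    padded = " " + " ".join(text.split()) + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical intent labels, each with one example utterance from the abstract.
EXAMPLES = {
    "show_all_plans": "मुझे सारे इंटरनेट प्लान्स दिखाओ",
    "show_my_plan": "मुझे मेरा इंटरनेट प्लान दिखाओ",
    "plan_not_working": "मेरा इंटरनेट प्लान चल नहीं रहा",
    "plan_validity": "मेरा इंटरनेट प्लान कब वैलिड है",
}


def rank_intents(utterance, examples=EXAMPLES):
    """Return (label, score) pairs sorted from most to least similar."""
    query = char_ngrams(utterance)
    scored = [
        (label, cosine(query, char_ngrams(text)))
        for label, text in examples.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for label, score in rank_intents("मेरा इंटरनेट प्लान नहीं चल रहा है"):
        print(f"{label}: {score:.3f}")
```

Because the four example sentences share most of their characters, their scores for a given query tend to sit close together. That is one concrete form of the granularity problem the session describes, and a reason to compare this baseline against the other approaches in the list.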


Key Takeaways

  • Understanding of granular problems and challenges of intent identification
  • Different approaches to solve the problem
  • Exposure to available public datasets for Indic languages and their utility


  • Krupal Modi

    Director of Machine Learning


Copyright 2019 Analytics Vidhya. All rights reserved