A Comprehensive Learning Path to Understand and Master NLP in 2020

Introduction

Google “NLP jobs” and a remarkable number of relevant results show up. Businesses are springing up around the world that cater exclusively to Natural Language Processing (NLP) roles! The industry demand for NLP experts has never been higher – and it is expected to grow rapidly over the next few years.

But the supply side is falling short. Freshers and even experienced professionals who want to land an NLP-based role are struggling to break into the industry. We can pinpoint one of the biggest pain points – a lack of structured learning.

There are far too many resources these days that cover NLP concepts, but the majority do so in a scattershot manner. Freshers tend to pore over articles and books, work through various blogs and videos, and still end up struggling to piece together an end-to-end understanding.

This is where our NLP learning path comes in! We are thrilled to present a comprehensive and structured learning path to help you learn and master NLP from scratch in 2020!

This learning path has been curated by experts at Analytics Vidhya who have gone through hundreds of resources to put it together for our community. Follow this path in 2020 and you’ll be well placed to land a role in the NLP domain!

 

Our Framework for the NLP Learning Path

Structure – that’s at the heart of everything we do. Our learning paths are popular for their structure as well as their comprehensive nature. Here’s how we’ve broken down each month of the NLP learning path to help you plan your learning journey:

  • Objective: What will you learn in that month? What are the key takeaways? How will your NLP journey progress? We mention this at the start of each month to ensure you know where you stand and where you will be at the end of that particular month
  • Time Suggested: How much time on average you should spend on that section per week
  • Resources to Learn: A collection of the top resources for the NLP topics you will learn in that month. This includes articles, tutorials, videos, research papers, and other similar resources

Let’s dive into it!

 

Month 0 – Prerequisites (Optional)

Objective: This is for all of you who are not yet familiar with Python and Data Science. By the end of this month, you should have a fair idea about the building blocks of machine learning and how to program in Python.

Time Suggested: 6 hours/week

Python for Data Science:

Learn Statistics:

Data Preparation:

  • Training and Testing:

Linear Regression:

Logistic Regression:

Decision Tree Algorithm:

K-fold Cross-Validation:

Singular Value Decomposition (SVD):

Month 1 – Getting Comfortable with Text Data

Objective: And off we go! This month is all about getting you familiar and comfortable with the basic text preprocessing techniques. You should be able to build a text classification model by the end of this section.

Time Suggested: 5 hours/week

Load Text Data from Multiple Sources:

Learn to use Regular Expressions:

Text Preprocessing:
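
To make the two topics above a little more concrete, here is a minimal sketch of a cleaning step built on Python's `re` module. The `preprocess` function, the two patterns, and the tiny `STOPWORDS` set are illustrative assumptions rather than anything taken from the resources in this path; in practice you would likely use a fuller stopword list from a library such as NLTK or spaCy.

```python
import re

# Tiny illustrative stopword list (an assumption for this sketch only).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "it"}

# Matches http:// and https:// links up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")
# Matches runs of lowercase letters, digits, and apostrophes.
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def preprocess(text):
    """Lowercase the text, strip URLs, tokenize, and drop stopwords."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)
    tokens = TOKEN_PATTERN.findall(text)
    return [tok for tok in tokens if tok not in STOPWORDS]


if __name__ == "__main__":
    sample = "The new phone is AMAZING! Read more at https://example.com/review"
    print(preprocess(sample))
```

Lowercasing happens before tokenizing so that the token pattern only has to handle lowercase characters. Whether to drop stopwords at all depends on the task, so treat that step as optional.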

Exploratory Analysis of Text Data:

Extract Meta Features from Text:

Project:

  • Build a Text Classification model using Meta Features. You can use the dataset from the practice problem Identify the Sentiments
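
For a sense of what meta features can look like, here is a rough sketch that uses only the standard library. The feature set and the `meta_features` helper are assumptions chosen for illustration; pick whichever features suit your dataset.

```python
import string


def meta_features(text):
    """Return simple surface-level statistics for one piece of text."""
    words = text.split()
    return {
        "char_count": len(text),
        "word_count": len(words),
        # Guard against empty input to avoid dividing by zero.
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "uppercase_word_count": sum(1 for w in words if w.isupper()),
        "punctuation_count": sum(1 for ch in text if ch in string.punctuation),
        "hashtag_count": sum(1 for w in words if w.startswith("#")),
    }


if __name__ == "__main__":
    print(meta_features("I LOVE this phone! #happy"))
```

Each dictionary can become one row of a feature table, which you can then feed to any standard classifier.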

 

Month 2 – Computational Linguistics and Word Vectors

Objective: This month you will start to see the magic of NLP. You will learn how English grammar can be utilized to extract key information from text. You will also work with word vectors, an advanced technique to create features from text.

Time Suggested: 5 hours/week

Extract Linguistic Features:

  • Part-of-Speech Tagging using spaCy:

  • Named Entity Recognition using spaCy:

  • Dependency Parsing by Stanford:

 

Text Representation in Vector Space:
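
As a first intuition for representing text as vectors, the sketch below builds simple bag-of-words count vectors and compares them with cosine similarity. Note that this is a count-based stand-in, not learned word embeddings such as word2vec or GloVe, and the helper names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(tokens):
    """Map each token to the number of times it occurs."""
    return Counter(tokens)


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse count vectors."""
    shared = vec_a.keys() & vec_b.keys()
    dot = sum(vec_a[tok] * vec_b[tok] for tok in shared)
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document has no direction to compare.
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc_1 = bag_of_words("the cat sat".split())
    doc_2 = bag_of_words("the cat ran".split())
    print(cosine_similarity(doc_1, doc_2))
```

Word embeddings follow the same principle of comparing vectors, except that each word gets a dense vector learned from data instead of a raw count.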

Topic Modeling:

 

Information Extraction:

Projects:

  • Build Sentiment Detection Model using Word Embeddings. You can use the dataset from the practice problem Identify the Sentiments
  • Categorize News Articles using Topic Modeling

 

Month 3 – Deep Learning Refresher for NLP

Objective: Deep learning is at the heart of recent developments and breakthroughs in NLP. It powers state-of-the-art models from Google’s BERT to OpenAI’s GPT-2, so every NLP enthusiast should have at least a basic understanding of how it works. This month, you will focus on the concepts, algorithms, and tools around Deep Learning.

Time Suggested: 5 hours/week

Neural Networks:

Optimization Algorithms:

Recurrent Neural Networks (RNNs) and LSTM:

  • A friendly introduction to RNNs:

Introduction to PyTorch:

 

Month 4 – Deep Learning Models for NLP

Objective: Now that you have a taste of deep learning and how it applies in the NLP context, it’s time to take things up a notch. Dive into advanced deep learning models such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks, among others. These will help you master industry-grade NLP use cases.

Time Suggested: 5 hours/week

Recurrent Neural Networks (RNNs) for Text Classification:

CNN Models for NLP:

Projects:

  • Build a model to find named entities in the text using LSTM. You can get the dataset from here

 

Month 5 – Sequential Modeling

Objective: This month, you will learn to use sequence models, which take sequences as inputs and/or produce them as outputs. This is a very useful concept in NLP, as you’ll soon discover!

Time Suggested: 5 hours/week

Language Modeling:

  • Language Models and RNNs by Stanford:

Sequence-to-Sequence Modeling:

Projects:

  • Train a language model on the Enron Email dataset to build an auto-completion system
  • Build a Neural Machine Translation Model (English to any language of your choice)
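
Before training a neural language model, it can help to see the idea in its simplest form. The sketch below is a count-based bigram model, a much simpler stand-in for the RNN-based models this month covers. The toy corpus and the helper names are assumptions for illustration, and the sketch does not use the Enron data.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows each other word."""
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for prev_word, next_word in zip(tokens, tokens[1:]):
            model[prev_word][next_word] += 1
    return model


def suggest_next(model, word, top_n=3):
    """Return up to top_n most frequent continuations of `word`."""
    counts = model.get(word.lower())
    if not counts:
        return []  # The word never appeared with a following word.
    return [candidate for candidate, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    corpus = [
        "please find the report attached",
        "please find the attached file",
        "please review the report",
    ]
    model = train_bigram_model(corpus)
    print(suggest_next(model, "the", top_n=2))
```

A neural language model replaces these raw counts with learned probabilities that can take a much longer context into account, but the auto-completion loop around it stays the same: given the words so far, rank the candidates for the next word.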

 

Month 6 – Transfer Learning in NLP

Objective: Transfer learning is all the rage in NLP at the moment. It has helped democratize the state-of-the-art NLP models you will have come across earlier in this path. This month introduces ULMFiT, Transformers, BERT, and GPT-2.

Time Suggested: 5 hours/week

ULMFiT:

Transformers:

Pre-trained Large Language Models (BERT and GPT-2):

Fine-Tuning pre-trained Models:

 

Month 7 – Chatbots and Audio Processing

Objective: You will learn how to build a chatbot or conversational agent this month. Once you have mastered NLP, the next frontier you can tackle is Audio Processing.

Time Suggested: 5 hours/week

Chatbots:

  • Rasa Masterclass:

Audio Processing:

Project:

  • Build a chatbot with a voice interface using Rasa

 

Infographic – NLP Learning Path for 2020

Our community loves the infographics we design for each learning path. These infographics serve two primary purposes:

  • They help us visualize the structure of how we’ll learn different topics
  • They can be used as checklists to tick off concepts as you progress in your NLP journey

So, we’re thrilled to present below the NLP learning path infographic for 2020! You can download a high-resolution version from here.
