Workshop: Applied Natural Language Processing

Interacting with artificial intelligence systems can feel a bit simulated at times. This is because the way humans converse with one another is very different from the way we usually interact with AI systems. Thankfully, there has been extensive research aimed at bridging this gap in conversational AI. In this 8-hour workshop, you will learn about natural language processing, create word embeddings, and build models to perform NLP tasks such as sentiment analysis, auto-correction and much more.

Structure of the Workshop

Introduction to Natural Language Processing
Text pre-processing and Wrangling
- Removing HTML tags/noise
- Removing accented characters
- Removing special characters/symbols
- Handling contractions
- Stemming
- Lemmatization
- Stop word removal
Project: Build a duplicate character removal module
Project: Build a spell-check and correction module
Project: Build an end-to-end text pre-processor
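
To give a feel for what an end-to-end text pre-processor might involve, here is a minimal sketch that chains several of the steps listed above using only the Python standard library. It is not the workshop's actual project code: the function names and the tiny CONTRACTIONS and STOP_WORDS tables are hypothetical and exist only for illustration. Stemming and lemmatization are left out because they normally rely on a library such as NLTK or spaCy.

```python
import html
import re
import unicodedata

# Hypothetical, deliberately tiny lookup tables for illustration only.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is", "i'm": "i am"}
STOP_WORDS = {"a", "an", "the", "is", "it", "to", "and", "of"}


def strip_html(text):
    """Drop anything that looks like an HTML tag, then unescape entities."""
    return html.unescape(re.sub(r"<[^>]+>", " ", text))


def strip_accents(text):
    """Decompose accented characters and keep only their ASCII base letters."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def expand_contractions(text):
    """Replace each known (lowercase) contraction with its expanded form."""
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text)
    return text


def strip_special_chars(text):
    """Replace everything except lowercase letters, digits and whitespace."""
    return re.sub(r"[^a-z0-9\s]", " ", text)


def preprocess(text):
    """Run the steps in order and return the remaining tokens."""
    text = strip_html(text)
    text = strip_accents(text).lower()
    text = expand_contractions(text)
    text = strip_special_chars(text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]
```

For example, `preprocess("<p>It's a café, and I can't wait!</p>")` returns `['cafe', 'i', 'cannot', 'wait']`. The order of the steps matters: contractions are expanded before special characters are removed, since the apostrophe would otherwise be lost.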
Text Understanding
- POS (Parts of Speech) Tagging
- Text Parsing
  - Shallow Parsing
  - Dependency Parsing
  - Constituency Parsing
- NER (Named Entity Recognition) Tagging
Project: Build your own POS Tagger
Project: Build your own NER Tagger
Text Representation – Feature Engineering
- Traditional Statistical Models – BOW, TF-IDF
- Newer Deep Learning Models for word embeddings – Word2Vec, GloVe, FastText
Project: Similarity and Movie Recommendations
Project: Interactive exploration of Word Embeddings
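
As a rough illustration of the traditional statistical representations mentioned above, the sketch below computes TF-IDF weights and cosine similarity in plain Python. It assumes the simplest textbook weighting (term frequency times log(N / document frequency)); libraries typically use smoothed variants, so their numbers will differ. The function names and the toy "movie description" tokens in the usage note are hypothetical and not taken from the workshop material.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in docs:
        counts = Counter(tokens)  # the bag-of-words counts for this document
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With three toy descriptions such as `[["space", "alien", "ship"], ["space", "robot", "war"], ["love", "story"]]`, the first two vectors share the term "space" and so have a positive cosine similarity, while the first and third share nothing and score 0. Ranking items by this score is one simple basis for a similarity-based recommender.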
Case Studies for other common NLP Tasks
- Project: Sentiment Analysis using unsupervised learning and supervised learning (machine and deep learning)
- Project: Text Clustering (grouping similar movies)
- Project: Text Summarization and Topic Models
Promise of Deep Learning for NLP, Transfer and Generative Learning
Final words and where to go from here?

Key Takeaways:

Learn and understand popular NLP workflows with interactive examples
Covers concepts and interactive projects on cleaning and handling noisy unstructured text data including duplicate checks, spelling corrections and text wrangling
Build your own POS and NER taggers and parse text data to understand it better
Understand, build and explore text semantics and representations with traditional statistical models and newer word embedding models
Projects on popular NLP tasks including text classification, sentiment analysis, text clustering, summarization, topic models and recommendations
Brief coverage of the promise of deep learning for NLP

System requirements:

Standard system with 4-8GB RAM,
2-4 core processor (i5/i7/AMD),
GPU preferred for some deep learning tasks but not essential,
Windows/Linux/Mac OS.
Cloud based services like AWS EC2 also work fine.

Notes:

Participants need to bring their own laptops to the workshop.
Anaconda distribution (Python 3.6) preferred with the following libraries pre-installed: nltk, spacy, TextBlob, scikit-learn, numpy, pandas, keras, tensorflow

Prerequisites:

System Setup
- Laptop with at least 8 GB of RAM
- Install Anaconda (Resource)
- Install the following packages:
  - Seaborn
  - NLTK 3.4
  - Wordcloud 1.5.0
  - Pyspellchecker 0.5.2
  - Torch
  - Gensim
  - Keras
  - TensorFlow
  - Plotly 4.0.0
  - Transformers (Resource)
  - Spacy 2.2.1 (Resource)
We will use laptops for most of the workshop and Google Colab for the deep learning part, so make sure you have a Google login and free space on your Google Drive.
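
One quick way to confirm the setup before the workshop is to check that each package listed above can be found by Python. The sketch below is only a suggestion: the import names are assumptions based on how these packages are usually imported (pyspellchecker, for instance, is imported as `spellchecker`), and it checks presence only, not the version numbers given in the list.

```python
from importlib.util import find_spec

# Assumed import names for the packages listed above.
REQUIRED_MODULES = [
    "seaborn", "nltk", "wordcloud", "spellchecker", "torch", "gensim",
    "keras", "tensorflow", "plotly", "transformers", "spacy",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for large packages.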
Pre-Reads
- Programming knowledge in Python (Resource)
- Basics of Machine learning (Resource)
- Text Cleaning (Resource)
- Google Colab Intro (Resource)

Venue :- Hotel La Marvella - A Sarovar Premiere Hotel, Bangalore South-End Circle 2nd block, No 1, 15th Cross Rd, 2nd Block, Jayanagar, Bengaluru, Karnataka
Map :- goo.gl/maps/hwH5hEcA9K92

Sudalai Rajkumar (SRK)