Workshop: Applied Natural Language Processing

Nov 16, 2019

09:30

Hotel La Marvella, Bengaluru

Interacting with artificial intelligence systems can feel stilted at times, because the way humans converse with one another is quite different from the way we usually interact with AI systems. Thankfully, there has been a great deal of research aimed at bridging this gap in conversational AI. In this 8-hour workshop, you will learn about natural language processing, creating word embeddings, and building models to perform NLP tasks such as sentiment analysis, auto-correction and much more.

Structure of the Workshop

  1. Introduction to Natural Language Processing
  2. Text pre-processing and Wrangling
    • Removing HTML tags/noise
    • Removing accented characters
    • Removing special characters/symbols
    • Handling contractions
    • Stemming
    • Lemmatization
    • Stop word removal
  3. Project: Build a duplicate character removal module
  4. Project: Build a spell-check and correction module
  5. Project: Build an end-to-end text pre-processor
  6. Text Understanding
    • POS (Part-of-Speech) Tagging
    • Text Parsing
      • Shallow Parsing
      • Dependency Parsing
      • Constituency Parsing
    • NER (Named Entity Recognition) Tagging
  7. Project: Build your own POS Tagger
  8. Project: Build your own NER Tagger
  9. Text Representation – Feature Engineering
    • Traditional Statistical Models – BOW, TF-IDF
    • Newer Deep Learning Models for word embeddings – Word2Vec, GloVe, FastText
  10. Project: Similarity and Movie Recommendations
  11. Project: Interactive exploration of Word Embeddings
  12. Case Studies for other common NLP Tasks
    • Project: Sentiment Analysis using unsupervised learning and supervised learning (machine and deep learning)
    • Project: Text Clustering (grouping similar movies)
    • Project: Text Summarization and Topic Models
  13. Promise of Deep Learning for NLP, Transfer and Generative Learning
  14. Final words and where to go from here?
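
To give a feel for the text pre-processing steps listed under item 2, here is a minimal sketch using only the Python standard library. It is an illustration, not the workshop's actual material: the function names, the tiny contraction table and the stop-word set are all hypothetical placeholders, and stemming and lemmatization are left out because they normally rely on a library such as nltk or spacy.

```python
import html
import re
import unicodedata

# Hypothetical, hand-picked tables for illustration only;
# a real pipeline would use much larger resources.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is"}
STOP_WORDS = {"a", "an", "the", "is", "it", "to", "of", "and"}


def strip_html(text):
    """Replace tags with spaces (simple regex) and decode entities like &amp;."""
    return html.unescape(re.sub(r"<[^>]+>", " ", text))


def remove_accents(text):
    """Decompose characters (NFKD) and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_contractions(text):
    """Replace each contraction found in CONTRACTIONS with its expansion."""
    alternatives = "|".join(re.escape(c) for c in CONTRACTIONS)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def remove_special_characters(text):
    """Replace anything that is not a letter, digit or whitespace with a space."""
    return re.sub(r"[^a-zA-Z0-9\s]", " ", text)


def remove_stop_words(tokens):
    """Keep only the tokens that are not in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(text):
    """Chain the steps above and return a list of cleaned, lower-cased tokens."""
    text = strip_html(text)
    text = remove_accents(text)
    text = expand_contractions(text.lower())
    text = remove_special_characters(text)
    return remove_stop_words(text.split())
```

For example, preprocess("<p>It's a café, and I can't wait!</p>") returns ['cafe', 'i', 'cannot', 'wait']. The order of the steps matters here: contractions are expanded before special characters are removed, since the apostrophe would otherwise be stripped first.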

Key Takeaways:

  • Learn and understand popular NLP workflows with interactive examples
  • Covers concepts and interactive projects on cleaning and handling noisy unstructured text data including duplicate checks, spelling corrections and text wrangling
  • Build your own POS and NER taggers and parse text data to understand it better
  • Understand, build and explore text semantics and representations with traditional statistical models and newer word embedding models
  • Projects on popular NLP tasks including text classification, sentiment analysis, text clustering, summarization, topic models and recommendations
  • Brief coverage of the promise of deep learning for NLP

System requirements:

  • Standard system with 4-8 GB RAM
  • 2-4 core processor (i5/i7/AMD)
  • GPU preferred for some deep learning tasks, but not essential
  • Windows/Linux/Mac OS
  • Cloud-based services like AWS EC2 also work fine

Notes:

  • Participants need to bring their own laptops to the workshop.
  • Anaconda distribution (Python 3.6) preferred, with the following libraries pre-installed: nltk, spacy, TextBlob, scikit-learn, numpy, pandas, keras, tensorflow

Prerequisites:

  • System Setup
    • Laptop with at least 8 GB of RAM
    • Install Anaconda
    • Install the following packages:
      • Seaborn
      • NLTK 3.4
      • Wordcloud 1.5.0
      • Pyspellchecker 0.5.2
      • Torch
      • Gensim
      • Keras
      • TensorFlow
      • Plotly 4.0.0
      • Transformers
      • spaCy 2.2.1
  • We will use laptops for most of the workshop and Google Colab for the deep learning part, so make sure you have a Google account and free space on your Google Drive.
  • Pre-Reads

Venue: Hotel La Marvella - A Sarovar Premiere Hotel, Bangalore South-End Circle 2nd block, No 1, 15th Cross Rd, 2nd Block, Jayanagar, Bengaluru, Karnataka
Map: goo.gl/maps/hwH5hEcA9K92

  • Sudalai Rajkumar (SRK), Data Scientist, H2O.ai

Copyright 2019 Analytics Vidhya. All rights reserved