Applied Natural Language Processing

Nov 16, 2019

09:30

Interacting with artificially intelligent systems can feel stilted at times, because the way humans converse with one another differs markedly from the way we usually interact with AI systems. Research aimed at closing this gap in conversational AI is very active. In this 8-hour workshop, you will learn about natural language processing, create word embeddings, and build models for NLP tasks such as sentiment analysis, autocorrection, and more.

Prerequisites for the workshop:

  • Python programming experience
  • Basic knowledge of machine learning

Structure of the Workshop

  1. Introduction to Natural Language Processing
  2. Text pre-processing and Wrangling
    • Removing HTML tags/noise
    • Removing accented characters
    • Removing special characters/symbols
    • Handling contractions
    • Stemming
    • Lemmatization
    • Stop word removal
  3. Project: Build a duplicate character removal module
  4. Project: Build a spell-check and correction module
  5. Project: Build an end-to-end text pre-processor
  6. Text Understanding
    • POS (Parts of Speech) Tagging
    • Text Parsing
      • Shallow Parsing
      • Dependency Parsing
      • Constituency Parsing
    • NER (Named Entity Recognition) Tagging
  7. Project: Build your own POS Tagger
  8. Project: Build your own NER Tagger
  9. Text Representation – Feature Engineering
    • Traditional Statistical Models – BOW, TF-IDF
    • Newer Deep Learning Models for word embeddings – Word2Vec, GloVe, FastText
  10. Project: Similarity and Movie Recommendations
  11. Project: Interactive exploration of Word Embeddings
  12. Case Studies for other common NLP Tasks
    • Project: Sentiment Analysis using unsupervised learning and supervised learning (machine and deep learning)
    • Project: Text Clustering (grouping similar movies)
    • Project: Text Summarization and Topic Models
  13. Promise of Deep Learning for NLP, Transfer and Generative Learning
  14. Final words and where to go from here?
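
To give a flavour of the pre-processing steps in section 2, here is a minimal sketch using only the Python standard library. It is not the workshop's own code: the function names, the tiny `CONTRACTIONS` table and the `STOP_WORDS` set are hypothetical and chosen purely for illustration. Libraries such as nltk or spacy, listed in the notes, would normally supply fuller resources, along with stemming and lemmatization, which are left out here.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny lookup tables for illustration only.
CONTRACTIONS = {"don't": "do not", "it's": "it is", "can't": "cannot"}
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "to", "of"}


def strip_html(text):
    """Replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def remove_accents(text):
    """Decompose accented characters and keep only their ASCII base letters."""
    normalized = unicodedata.normalize("NFKD", text)
    return normalized.encode("ascii", "ignore").decode("ascii")


def expand_contractions(text):
    """Expand known contractions using the small table above."""
    return " ".join(CONTRACTIONS.get(tok, tok) for tok in text.split())


def remove_special_chars(text):
    """Keep letters, digits and whitespace; replace everything else with a space."""
    return re.sub(r"[^a-zA-Z0-9\s]", " ", text)


def remove_stop_words(text):
    """Drop tokens that appear in the stop-word set."""
    return " ".join(tok for tok in text.split() if tok not in STOP_WORDS)


def preprocess(text):
    """Chain the individual steps into one simple pre-processor."""
    text = strip_html(text)
    text = remove_accents(text)
    text = text.lower()
    text = expand_contractions(text)
    text = remove_special_chars(text)
    return remove_stop_words(text)
```

For example, `preprocess("<p>It's a café, don't miss it!</p>")` returns `"cafe do not miss"`. The order of the steps matters: contractions are expanded before special characters are removed, otherwise the apostrophes would already be gone.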

Key Takeaways:

  • Learn and understand popular NLP workflows with interactive examples
  • Work through concepts and interactive projects on cleaning and handling noisy, unstructured text data, including duplicate checks, spelling corrections and text wrangling
  • Build your own POS and NER taggers and parse text data to understand it better
  • Understand, build and explore text semantics and representations with traditional statistical models and newer word embedding models
  • Projects on popular NLP tasks including text classification, sentiment analysis, text clustering, summarization, topic models and recommendations
  • Brief coverage of the promise of deep learning for NLP
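
As an illustration of the traditional statistical representations mentioned above, the sketch below computes TF-IDF vectors and cosine similarity in plain Python. It assumes whitespace tokenization and the unsmoothed textbook weighting tf × log(N / df). The function names are hypothetical, and library implementations differ in detail: scikit-learn's `TfidfVectorizer`, for instance, applies idf smoothing and L2 normalization by default.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / df[term])
                for term, count in tf.items()
            }
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up one-line plot summaries, purely for illustration.
plots = [
    "space adventure with robots",
    "robots fight in space",
    "romantic comedy in paris",
]
vecs = tfidf_vectors(plots)
sim_space = cosine(vecs[0], vecs[1])    # two plots sharing "space" and "robots"
sim_romance = cosine(vecs[0], vecs[2])  # no shared terms
```

Here `sim_space` is greater than `sim_romance`, which is zero because the first and third summaries share no terms. A similarity-based recommender can be built on the same idea by ranking items by their cosine score against a query item.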

System requirements:

  • Standard system with 4-8 GB RAM
  • 2-4 core processor (i5/i7/AMD)
  • GPU preferred for some deep learning tasks but not essential
  • Windows/Linux/Mac OS
  • Cloud-based services like AWS EC2 also work fine

Notes:

  • Participants need to bring their own laptops to the workshop.
  • The Anaconda distribution (Python 3.6) is preferred, with the following libraries pre-installed: nltk, spacy, TextBlob, scikit-learn, numpy, pandas, keras, tensorflow.

  • Sudalai Rajkumar (SRK)

    Data Scientist

    H2O.ai

    BIO

    Sudalai Rajkumar (aka SRK) is a Data Scientist at H2O.ai Inc, building Driverless AI, an automated machine learning platform. He currently leads the NLP efforts for this platform. Prior to this, he worked in both data science consulting and ML product development roles. In his 9+ years of experience, he has solved a lot

Copyright 2019 Analytics Vidhya. All rights reserved