GETTING STARTED WITH NATURAL LANGUAGE PROCESSING

Interacting with artificial intelligence systems can feel a bit simulated at times. This is because the way humans converse with one another is quite different from the way we usually interact with AI systems.

Thankfully, there has been extensive research aimed at bridging this gap in conversational AI systems. In this 8-hour workshop, you will learn about natural language processing, create word embeddings and develop learners to perform NLP tasks such as sentiment analysis, auto-correction and much more.

 
Prerequisites for the workshop:
 
  • Python programming experience
  • Basic knowledge of machine learning
  • Participation in data science competitions or experience with real-life data science projects
 

Structure of the Workshop

  1. Introduction to Natural Language Processing
  2. Text pre-processing and Wrangling
    • Removing HTML tags/noise
    • Removing accented characters
    • Removing special characters/symbols
    • Handling contractions
    • Stemming
    • Lemmatization
    • Stop word removal
  3. Project: Build a duplicate character removal module
  4. Project: Build a spell-check and correction module
  5. Project: Build an end-to-end text pre-processor
  6. Text Understanding
    • POS (Part-of-Speech) Tagging
    • Text Parsing
      • Shallow Parsing
      • Dependency Parsing
      • Constituency Parsing
    • NER (Named Entity Recognition) Tagging
  7. Project: Build your own POS Tagger
  8. Project: Build your own NER Tagger
  9. Text Representation – Feature Engineering
    • Traditional Statistical Models – BOW, TF-IDF
    • Newer Deep Learning Models for word embeddings – Word2Vec, GloVe, FastText
  10. Project: Similarity and Movie Recommendations
  11. Project: Interactive exploration of Word Embeddings
  12. Case Studies for other common NLP Tasks
    • Project: Sentiment Analysis using unsupervised learning and supervised learning (machine and deep learning)
    • Project: Text Clustering (grouping similar movies)
    • Project: Text Summarization and Topic Models
  13. Promise of Deep Learning for NLP, Transfer and Generative Learning
  14. Final words and where to go from here?
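
As a rough illustration of the pre-processing steps listed under item 2, here is a minimal sketch using only the Python standard library. It is not the workshop's own material: the function names, the tiny `CONTRACTIONS` and `STOP_WORDS` tables and the order of the steps are all assumptions made for this example. Stemming and lemmatization are left out because they normally rely on a library such as nltk or spacy.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny lookup tables for illustration only;
# a real pre-processor would use much fuller lists.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is"}
STOP_WORDS = {"a", "an", "the", "is", "it", "to", "and", "of"}


def strip_html(text):
    """Drop anything that looks like an HTML tag (a crude regex, not a parser)."""
    return re.sub(r"<[^>]+>", " ", text)


def remove_accents(text):
    """Decompose accented characters and discard the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_contractions(text):
    """Replace each known contraction with its expanded form."""
    for short, full in CONTRACTIONS.items():
        pattern = r"\b" + re.escape(short) + r"\b"
        text = re.sub(pattern, full, text, flags=re.IGNORECASE)
    return text


def remove_special_characters(text):
    """Replace everything except ASCII letters, digits and whitespace with a space."""
    return re.sub(r"[^a-zA-Z0-9\s]", " ", text)


def collapse_repeats(word, max_run=2):
    """Cap runs of the same character at max_run, e.g. 'sooooo' -> 'soo'."""
    return re.sub(r"(.)\1{%d,}" % max_run, r"\1" * max_run, word)


def preprocess(text):
    """Run the steps in sequence and return a list of cleaned tokens."""
    text = strip_html(text)
    text = remove_accents(text)
    text = expand_contractions(text.lower())
    text = remove_special_characters(text)
    tokens = [collapse_repeats(tok) for tok in text.split()]
    return [tok for tok in tokens if tok not in STOP_WORDS]
```

For example, `preprocess("<p>It's a sooooo good café!</p>")` returns `['soo', 'good', 'cafe']`. Capping repeated characters at two rather than one keeps legitimate double letters such as the "oo" in "good" intact; deciding between "soo" and "so" is the kind of problem a spell-check step would then handle.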

Key Takeaways:
  • Learn and understand popular NLP workflows with interactive examples
  • Covers concepts and interactive projects on cleaning and handling noisy unstructured text data including duplicate checks, spelling corrections and text wrangling
  • Build your own POS and NER taggers and parse text data to understand it better
  • Understand, build and explore text semantics and representations with traditional statistical models and newer word embedding models
  • Projects on popular NLP tasks including text classification, sentiment analysis, text clustering, summarization, topic models and recommendations
  • Brief coverage of the promise of deep learning for NLP

System requirements: Standard system with 4-8 GB RAM and a 2-4 core processor (i5/i7/AMD). A GPU is preferred for some deep learning tasks but not essential. Windows, Linux or macOS. Cloud-based services like AWS EC2 also work fine.

Notes: Participants need to bring their own laptops to the workshop.

Anaconda distribution (Python 3.6) preferred, with the following libraries pre-installed: nltk, spacy, TextBlob, scikit-learn, numpy, pandas, keras, tensorflow. If you install Python 3.7, remember that keras and tensorflow may not be available as a stable build.
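
One way to confirm before the workshop that the libraries listed above are present is a short check like the one below. This is only a sketch: the helper name `missing_libraries` is made up for this example, and the list uses import names, which can differ from package names (scikit-learn is imported as `sklearn`, TextBlob as `textblob`).

```python
import importlib.util

# Import names for the libraries listed in the setup notes (an assumption:
# these are the usual import names, not necessarily the install names).
REQUIRED = ["nltk", "spacy", "textblob", "sklearn",
            "numpy", "pandas", "keras", "tensorflow"]


def missing_libraries(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_libraries()
    print("Missing:", ", ".join(missing) if missing else "none")
```

The check only looks up whether each module can be located; it does not import the libraries or verify their versions.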

   

INSTRUCTORS



Dipanjan Sarkar



Dipanjan (DJ) Sarkar is a Data Scientist, a published author, and a consultant and trainer. He has consulted and worked with several startups as well as Fortune 500 companies like Intel. He primarily works on leveraging data science, advanced analytics, machine learning and deep learning to build large-scale intelligent systems. He holds a Master of Technology degree with specializations in Data Science and Software Engineering. He is also an avid supporter of self-learning and massive open online courses. He plans to venture soon into the world of open-source products to improve the productivity of developers across the world. Dipanjan has been an analytics practitioner for several years, specializing in machine learning, natural language processing, statistical methods and deep learning. With a passion for data science and education, he also acts as an AI Consultant and Mentor at various organizations like Springboard, where he helps people build their skills in areas like Data Science and Machine Learning. He is also a key contributor and editor for Towards Data Science, a leading online journal focusing on Artificial Intelligence and Data Science. Dipanjan has also authored several books on R, Python, Machine Learning, Social Media Analytics, Natural Language Processing and Deep Learning.





Raghav Bali



Raghav Bali is a Data Scientist at one of the world’s largest healthcare organizations. His work involves research and development of enterprise-level solutions based on Machine Learning, Deep Learning and Natural Language Processing for healthcare and insurance related use cases. In his previous role at Intel, he was involved in enabling proactive data-driven initiatives using Natural Language Processing, Deep Learning and traditional statistical methods. He has also worked in the financial domain with American Express, solving digital engagement and customer retention use cases. Raghav has also authored multiple books with leading publishers, the most recent one on the latest advancements in Transfer Learning research. Raghav has a master’s degree (gold medalist) in Information Technology from the International Institute of Information Technology, Bangalore. Raghav loves reading and is a shutterbug, capturing moments when he isn’t busy solving problems.

 
 