
In this online course, you will be introduced to the essential techniques of text mining, understood here as the extension of data mining’s standard predictive methods to unstructured text. The course covers these standard techniques and devotes considerable attention to the data preparation and handling methods required to transform unstructured text into a form in which it can be mined.

After completing this course, students will be able to:

  • Perform tokenization and create dictionaries to prepare text for classification tasks
  • Create numerical vectors from text data
  • Build classifiers with decision trees, Naive Bayes and linear models, using training and validation data
  • Perform “tagging” of text data
  • Cluster documents using the k-means algorithm
  • Generate predicted Twitter hashtags for text data
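
As a rough illustration of the first two outcomes above, the sketch below tokenizes a few documents, builds a dictionary from them, and turns each document into a vector of term counts. It is a minimal example and not course material: the function names (tokenize, build_dictionary, vectorize), the tokenization rule, and the sample sentences are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (an assumed, simple rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_dictionary(docs):
    """Map every distinct token found in the documents to a column index."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(vocab)}


def vectorize(text, dictionary):
    """Return a term-count vector; tokens missing from the dictionary are ignored."""
    vec = [0] * len(dictionary)
    for tok, n in Counter(tokenize(text)).items():
        if tok in dictionary:
            vec[dictionary[tok]] = n
    return vec


# Hypothetical sample documents, used only to exercise the functions.
docs = [
    "Text mining extends data mining.",
    "Data preparation comes first.",
]
dictionary = build_dictionary(docs)
vectors = [vectorize(doc, dictionary) for doc in docs]
```

Vectors of this kind are the usual input to classifiers such as Naive Bayes and to k-means clustering. In practice the counts are often reweighted, for example by term frequency and inverse document frequency, before modeling.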

Course Structure

  • Week 1: Introduction and Data Preparation
  • Week 2: Predictive Models for Text
  • Week 3: Retrieval and Clustering of Documents
  • Week 4: Information Extraction


June 10, 2016 to July 08, 2016

Duration: 4 Weeks

Time Requirement:

About 15 hours per week, at times of your choosing.

Fees: INR 32,940 (assuming $1 = INR 60)

Full Time/Part Time:

Part Time

Who Should Take This Course:

IT professionals, web marketing analysts, data mining and statistical consultants. In general: analysts and researchers who need to pilot, implement or analyze data mining methods aimed at data containing unstructured text (forms, surveys, etc.).

Faculty:

  • Anurag Bhardwaj
  • Nitin Indurkhya