[su_tabs]
[su_tab title = “Description”]
This online course expands on the topics from the Introduction to Analytics using Hadoop course and introduces statisticians and data analysts to higher-order tools in the Hadoop ecosystem.
[/su_tab]
[su_tab title = “Program Structure”]
In this course, you will learn about:
- The software components of the Hadoop Ecosystem
- Data loading, warehousing and manipulation with HBase, Hive, and Sqoop
- Data aggregation and designing data workflows with Pig and Cascading
- Machine learning and data mining with Mahout
Course Program:
- Week 1: The Hadoop Ecosystem and Data Warehousing and Manipulation pt. 1
- Week 2: Data Warehousing and Manipulation pt. 2
- Week 3: Higher-Order Hadoop Programming
- Week 4: Machine Learning and Data Mining
Important Dates:
January 23, 2015 to February 20, 2015
Duration:
4 Weeks
Time Requirement:
About 15 hours per week, at times of your choosing.
Fees:
INR 32,940 (assuming $1 = INR 60)
Part Time/Full Time:
Part Time
[/su_tab]
[su_tab title = “Eligibility”]
This course is intended for data scientists and statisticians who are familiar with Hadoop fundamentals, have programming experience, and want to learn how to process and analyze large data sets with Hadoop’s distributed computing capabilities and ecosystem components.
Prerequisites:
You need a background in:
- Analytics with Hadoop or equivalent familiarity with Hadoop and its core components
- Strong understanding of MapReduce and the MapReduce API
- Intermediate familiarity with Java preferred
- SQL and R
- Basic knowledge of operating systems (UNIX/Linux)
[/su_tab]
[su_tab title = “Tools”]
- Apache Hadoop
- Java
- HBase
- Hive
- Sqoop
- MapReduce
- R
[/su_tab]
[su_tab title = “Faculty”]
- Jenny Kim
[/su_tab]
[su_tab title = “Contact”]
[/su_tab]
[/su_tabs]