Introduction to Big Data with Apache Spark - UC BerkeleyX


This course attempts to articulate the expected output of data scientists and then teaches students how to use PySpark (part of Apache Spark) to deliver against those expectations. The assignments include log mining, textual entity recognition, and collaborative filtering exercises that teach students how to manipulate data sets using parallel processing with PySpark.

This course covers advanced undergraduate-level material.

Important Date:

Starts June 1, 2015

Duration:

5 weeks

5 – 7 hours per week

Full time/Part time:

Part time

A programming background and experience with Python are required. All exercises use PySpark (part of Apache Spark), but previous experience with Spark or distributed computing is not required.

Python
Apache Spark
PySpark

Anthony D. Joseph


Introduction to Big Data with Apache Spark – UC BerkeleyX – edX