Machine Learning at Scale using SparkML for Big Data

Wondering how to apply machine learning at large scale? Join this curated workshop, in which our data science experts introduce Apache Spark for processing large volumes of data and building machine learning models with MLlib.


This workshop goes beyond building your expertise in applying machine learning algorithms to large datasets with MLlib. It focuses on how those algorithms can be applied at scale, on petabytes of data, to produce models used for prediction.


Prerequisites for the workshop:

  • Python programming experience
  • Basics of Machine Learning
  • Hands-on experience in pandas



Structure of the workshop

This is an 8-hour workshop and includes the following modules:

  • Introduction to Spark
  • Installing and setting up Spark
  • Spark APIs in Scala, Python, and R
  • Basic syntax of Spark
  • Read, process, aggregate, write data
  • Modeling framework using MLlib
  • Different ML algorithms
  • Feature engineering 
  • Evaluation metrics 
  • Mini-hack
  • AMA

Make sure you don’t miss this advanced workshop on SparkML for Big Data! Get your tickets today to access this full-day session.




Rohan Rao


Rohan Rao (a.k.a. ‘vopani’) currently works as a Senior Data Scientist at Paytm, building machine learning solutions for the organization. Prior to this, he worked with multiple startups in the machine learning space across various industries, platforms and products.

He is a regular participant in hackathons and competitions and has won on several occasions, including the AV DataFest in April 2017, which ranks him #1 on the AV Leaderboard. He’s a Kaggle Grandmaster and ranks among the top 100 Kagglers in the world.

Apart from data science, he’s an 11-time National Champion in Sudoku and Puzzles, having represented India at the World Championships for the last 9 years.



Phani Srikanth


Phani Srikanth (a.k.a. ‘binga’) works with the data science team at Reliance JioMoney, where he oversees all things related to data. From product analytics that helps improve the product through collaboration with several teams to building machine learning solutions, he’s a part of JioMoney’s data initiatives.

He is also an active member on Kaggle and a top-5 member in the AV rankings.


Duration of Workshop: 8 hours


Venue: Keys Hotel Whitefield – Chamber 1 & 2 (conference room), Plot No.6, 1st Phase Industrial Area, ITPL Road, Opp. Graphite India, Bengaluru, Karnataka 560048
