
In this course you will learn the key fundamentals of processing data on a Hadoop cluster, how to import data, and how to choose a storage format for different data usage patterns.

  • Introduction to Apache Hadoop and the Hadoop Ecosystem
  • Apache Hadoop File Storage
  • Data Processing on an Apache Hadoop Cluster
  • Importing Relational Data with Apache Sqoop
  • Apache Spark Basics
  • Working with RDDs
  • Aggregating Data with Pair RDDs
  • Writing and Running Apache Spark Applications
  • Configuring Apache Spark Applications
  • Parallel Processing in Apache Spark
  • RDD Persistence
  • Common Patterns in Apache Spark Data Processing
  • DataFrames and Spark SQL
  • Message Processing with Apache Kafka
  • Capturing Data with Apache Flume
  • Integrating Apache Flume and Apache Kafka
  • Apache Spark Streaming: Introduction to DStreams
  • Apache Spark Streaming: Processing Multiple Batches
  • Apache Spark Streaming: Data Sources
  • Conclusion
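To give a flavor of what the Spark modules above cover, here is a minimal sketch of the map/reduceByKey pattern taught in "Aggregating Data with Pair RDDs". Plain Python collections stand in for Spark's RDD API here, since a running cluster isn't assumed; in the course itself you would write the equivalent with `rdd.flatMap`, `map`, and `reduceByKey`.

```python
# A plain-Python sketch of the pair-RDD word-count pattern.
# In Spark this would be:
#   rdd.flatMap(lambda l: l.split()).map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b)

def map_to_pairs(lines):
    # Map phase: emit a (word, 1) pair for every word in every line.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_by_key(pairs, func):
    # Reduce phase: merge all values that share a key, like reduceByKey.
    acc = {}
    for key, value in pairs:
        acc[key] = func(acc[key], value) if key in acc else value
    return acc

lines = ["spark makes big data simple", "big data needs spark"]
counts = reduce_by_key(map_to_pairs(lines), lambda a, b: a + b)
print(counts)  # {'spark': 2, 'makes': 1, 'big': 2, 'data': 2, 'simple': 1, 'needs': 1}
```

On a real cluster the reduce step runs in parallel across partitions, which is why the merge function must be associative and commutative, a point the course covers in the parallel-processing module.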

Duration: 4 Days

Mode: Online, Instructor-Led

Fees: $1200

  • Basic knowledge of SQL and Linux is recommended before starting