Apache Spark

Course Description

Spark is an open-source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that require low-latency processing that a typical MapReduce program cannot provide, Spark is the way to go.

  • Learn how it can run up to 100 times faster than MapReduce on iterative algorithms and interactive data mining.
  • Learn how it provides in-memory cluster computing for lightning-fast speed and supports Java, Python, R, and Scala APIs for ease of development.
  • Learn how it can handle a wide range of data processing scenarios by combining SQL, streaming, and complex analytics seamlessly in the same application.
  • Learn how it runs on Hadoop, on Mesos, in standalone mode, or in the cloud, and how it can access diverse data sources such as HDFS, Cassandra, HBase, and S3.

Course Details

  • Course duration: 10 weeks
  • Projects & Assignments: 40 Hours
  • Start date: Jan 7th, 2017

Course Syllabus

Using RDD for Creating Applications in Spark