Course Information
Course Overview
A Complete Guide to Processing Big Data with Spark
This course on Apache Spark and Scala aims to provide advanced expertise in the big data Hadoop ecosystem. It provides a standard skill set that helps a Big Data Hadoop developer go on to become a specialist.
The course starts with a detailed description of the limitations of MapReduce and how Spark can help overcome them. It then takes a deeper dive into the Scala programming language.
Moving on, it covers running Spark as a standalone cluster and builds an understanding of Resilient Distributed Datasets (RDDs).
The course also covers Spark SQL concepts: running SQL queries through SQLContext and Hive queries through HiveContext.
This course provides the material required to build a career path from Big Data Hadoop developer to Big Data Hadoop architect.
Course Content
- 12 sections
- 67 lectures
- Section 1 Module 1: Introduction to Big Data, Hadoop and Spark
- Section 2 Module 2: Introduction to Scala Programming Language
- Section 3 Module 3: Advanced Scala Programming
- Section 4 Module 4: Apache Spark RDDs
- Section 5 Module 5: Apache Spark RDDs II
- Section 6 Module 6: Working with Key-Value pairs
- Section 7 Module 7: Advanced Spark Programming
- Section 8 Module 8: Running Spark jobs on Cluster
- Section 9 Module 9: Spark SQL
- Section 10 Module 10: Spark Streaming
- Section 11 Module 11: Machine Learning in Spark
- Section 12 Module 12: GraphX in Spark
What You’ll Learn
- Understand the limitations of Hadoop MapReduce and how Spark overcomes them
- Gain expertise in the Scala programming language and its characteristics
- Work with RDDs and create applications in Spark
- Gain a thorough understanding of Spark SQL by using SQL queries in Spark
Reviews
- Carson Stevens
I wish the course included more Spark scripting and creating RDDs from more complex data. The video could also be edited to remove some of the errors. The spacing on the syntax drove me crazy the first half. Just be consistent. The ML section of this course was hardly useful and should be better explored with larger synthetic or real data.
- anil panda
no focus on how the function or method is working. Just explaining the output doesnt help in better understanding.
- Amit
Initial introduction to Hadoop was very much theoretical. Also the instructor seemed to read out from notes too fast. However, once the first section of Hadoop ends, rest of the sections beginning from scala are really quite good and easy to follow.
- Elvis Orji
There isn't a real world demo shown. Every examples are done in the command line. But its good for starter.