Course Information
Course Overview
Practice for CCA175 Test | ETL Qs | Spark 2.4 Hadoop Cluster VM | Cloudera Spark & Hadoop Developer | Includes Data
*Important Notice*
This course has been retired and is no longer receiving support. Originally designed to help students pass the now-retired Cloudera Certification exams, the material remains useful for those wanting to practice their skills on Spark and Hadoop clusters. However, its primary focus was certification preparation, which many students successfully completed.
This course prepares you for the transform, stage and store section of the CCA Spark & Hadoop Developer certification and helps you pass the CCA175 exam.
Students enrolling on this course can be 100% confident that after working on the problems contained here they will be in a great position to pass the transform, stage and store section of the CCA175 exam.
As the number of vacancies for big data, machine learning & data science roles continues to grow, so too will the demand for qualified individuals to fill those roles.
It’s often the case that to stand out from the crowd, it’s necessary to get certified.
This exam preparation series has been designed to help YOU pass the Cloudera certification CCA175. This is a hands-on, practical exam where the primary focus is on using Apache Spark to solve Big Data problems.
On solving the problems contained here you’ll have all the necessary skills & the confidence to handle any transform, stage & store related questions that come your way in the exam.
(a) There are 30 problems in this part of the exam preparation series, all of which are directly related to the transform, stage & store component of the CCA175 exam syllabus.
(b) Fully worked out solutions to all the problems.
(c) Also included is the Verulam Blue virtual machine, an environment with a Spark Hadoop cluster already installed so that you can practice working on the problems.
• The VM contains a Spark stack which allows you to read data from and write data to the Hadoop file system, as well as to store tables in the Hive metastore.
• All the datasets you need for the problems are already loaded onto HDFS, so you don’t have to do any extra work.
• The VM also has Apache Zeppelin installed with fully executed Zeppelin notebooks that contain solutions to the problems.
Course Content
- 5 sections
- 17 lectures
- Section 1 Introduction
- Section 2 Setting up the working environment
- Section 3 Code-Along - Selected Questions worked
- Section 4 Errata
- Section 5 ** BONUS SECTION **
What You’ll Learn
- Students will get hands-on experience working in a Spark Hadoop environment
- Students will get to practice loading data from HDFS for use in Spark applications
- Students will get to practice writing the results back into HDFS using Spark
- Students will get to practice reading and writing files in a variety of file formats
- Students will get to practice performing standard extract, transform, load (ETL) processes on data using the Spark API
- Students will also get to practice working with Zeppelin notebooks
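As an illustration of the kind of task listed above, here is a minimal PySpark sketch of a read, transform and write cycle. It is not taken from the course problems: the function name `run_etl` and the column names `status`, `customer_id` and `amount` are assumptions made for the example, and `spark` is assumed to be an existing SparkSession such as the one provided by the pyspark shell or a Zeppelin notebook.

```python
def run_etl(spark, input_path, output_path):
    """Read a CSV file, total completed orders per customer, write Parquet.

    `spark` is an existing SparkSession. The column names `status`,
    `customer_id` and `amount` are hypothetical, chosen for illustration.
    """
    # Extract: load a CSV file with a header row, for example from HDFS.
    orders = (
        spark.read
        .option("header", "true")
        .option("inferSchema", "true")
        .csv(input_path)
    )

    # Transform: keep completed orders and sum the amount per customer.
    completed = orders.filter("status = 'COMPLETE'")
    totals = (
        completed.groupBy("customer_id")
        .sum("amount")
        .withColumnRenamed("sum(amount)", "total_amount")
    )

    # Load: write the result in a different file format, replacing old output.
    totals.write.mode("overwrite").parquet(output_path)
    return totals
```

With a session available, it could be called as `run_etl(spark, "/data/orders.csv", "/output/order_totals")`, where both paths are hypothetical. Swapping `.parquet(...)` for `.json(...)` or `.orc(...)` on the writer is one way to practice the other file formats.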
Reviews
- Learner 123
I must say, this is an exceptional resource for anyone looking to enhance their data analysis skills using Spark SQL. The platform that comes with this course is really useful for working on my data analysis and SQL skills. Excellent course!!!
- Fabio
Very well done!
- Debmalya Panday
Simply a masterpiece!! Thanks Mathew for putting your efforts into preparing this. I would highly recommend this to everyone preparing for CCA-175.
- Sandeep Shenoy
Sometimes they say the journey is better than the destination. While I only purchased this course with a certification in view, I started enjoying the quality of the questions. Add to that the speed with which Matthew responds to our queries, which sweetens the learning. I have also purchased another course from the same author and am enjoying that course equally!