Udemy

Spark SQL and Spark 3 using Scala Hands-On with Labs

  • 22,590 students
  • Last updated 2/2023
4.5
(3,101 ratings)

Course Information

Enrolment Date
Year-round enrolment
Course Level
Learning Mode
Duration
12 hours 0 minutes
Language of Instruction
English
Instructors
Durga Viswanatha Raju Gadiraju, Phani Bhushan Bozzam, Vinay Gadiraju
Rating
4.5
(3,101 ratings)

Course Overview

A comprehensive course on Spark SQL as well as Data Frame APIs using Scala, with complimentary lab access

As part of this course, you will learn all the key skills to build Data Engineering Pipelines using Spark SQL and Spark Data Frame APIs, with Scala as the programming language. This course used to be the CCA 175 Spark and Hadoop Developer course for Certification Exam preparation. The exam was sunset on 10/31/2021, and we have renamed the course to Spark SQL and Spark 3 using Scala, as it covers industry-relevant topics beyond the scope of certification.

About Data Engineering

Data Engineering is the practice of processing data to meet downstream needs. As part of Data Engineering, we need to build different pipelines, such as Batch Pipelines and Streaming Pipelines. All roles related to Data Processing are consolidated under Data Engineering; conventionally, they have been known as ETL Development, Data Warehouse Development, etc. Apache Spark has evolved into a leading technology for Data Engineering at scale.

I have prepared this course for anyone who would like to transition into a Data Engineer role using Spark (Scala). I am myself a Data Engineering Solution Architect with proven experience designing solutions using Apache Spark.

Let us go through the details of what you will learn in this course. Keep in mind that the course includes a lot of hands-on tasks, which will give you enough practice using the right tools. There are also plenty of tasks and exercises for you to evaluate yourself.

Setup of Single Node Big Data Cluster

Many of you would like to transition to Big Data from conventional technologies such as Mainframes, Oracle PL/SQL, etc., and you might not have access to Big Data Clusters. It is very important for you to set up the environment in the right manner. Don't worry if you do not have a cluster handy; we will guide you through support via Udemy Q&A.

  • Setup Ubuntu-based AWS Cloud9 Instance with the right configuration

  • Ensure Docker is set up

  • Setup Jupyter Lab and other key components

  • Setup and Validate Hadoop, Hive, YARN, and Spark

Are you feeling a bit overwhelmed about setting up the environment? Don't worry! We will provide complimentary lab access for up to 2 months. Here are the details.

  • Training uses an interactive environment. You will get 2 weeks of lab access to begin with. If you like the environment and acknowledge it by providing a 5* rating and feedback, the lab access will be extended by an additional 6 weeks (2 months in total). Feel free to send an email to support@itversity.com to get complimentary lab access. Also, if your employer provides a multi-node environment, we will help you set up the material for practice as part of a live session. On top of Q&A support, we also provide the required support via live sessions.

A quick recap of Scala

This course requires a decent knowledge of Scala. To make sure you understand Spark from a Data Engineering perspective, we added a module to quickly warm up with Scala. If you are not familiar with Scala, we suggest you first go through relevant courses on Scala as a programming language.
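As a quick self-check on the Scala basics the course relies on, here is a minimal sketch (all names are illustrative, not taken from the course material) using case classes and higher-order functions — the same functional style Spark's APIs build on:

```scala
// Hypothetical warm-up: case classes, collections, and higher-order functions
case class Order(orderId: Int, status: String, amount: Double)

object ScalaWarmup extends App {
  val orders = List(
    Order(1, "COMPLETE", 100.0),
    Order(2, "PENDING", 250.0),
    Order(3, "COMPLETE", 75.5)
  )

  // filter + map + sum: the chained, functional style Spark code follows
  val completedRevenue = orders
    .filter(_.status == "COMPLETE")
    .map(_.amount)
    .sum

  println(s"Completed revenue: $completedRevenue")
}
```

If this snippet reads naturally to you, the Scala recap module should be a quick refresher rather than new material.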

Data Engineering using Spark SQL

Let us deep-dive into Spark SQL to understand how it can be used to build Data Engineering Pipelines. Spark SQL gives us the ability to leverage the distributed computing capabilities of Spark, coupled with easy-to-use, developer-friendly SQL-style syntax.

  • Getting Started with Spark SQL

  • Basic Transformations using Spark SQL

  • Managing Spark Metastore Tables - Basic DDL and DML

  • Managing Spark Metastore Tables - DML and Partitioning

  • Overview of Spark SQL Functions

  • Windowing Functions using Spark SQL
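To give a flavour of the module above, here is a minimal, hypothetical sketch of the SQL style of processing in Spark; the `orders` data set and its columns are assumptions for illustration, not part of the course material:

```scala
import org.apache.spark.sql.SparkSession

object SparkSQLBasics extends App {
  // A local SparkSession for experimentation
  val spark = SparkSession.builder()
    .appName("Spark SQL Basics")
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._

  // Register a tiny in-memory data set as a temporary view
  Seq(
    (1, "2023-01-01", "COMPLETE"),
    (2, "2023-01-01", "PENDING"),
    (3, "2023-01-02", "COMPLETE")
  ).toDF("order_id", "order_date", "order_status")
    .createOrReplaceTempView("orders")

  // Basic transformations: filter, aggregate, and sort using SQL syntax
  spark.sql("""
    SELECT order_date, count(1) AS order_count
    FROM orders
    WHERE order_status = 'COMPLETE'
    GROUP BY order_date
    ORDER BY order_date
  """).show()

  spark.stop()
}
```

In the course itself, the same style of query is run against Spark Metastore tables on the lab cluster rather than a temporary view.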

Data Engineering using Spark Data Frame APIs

Spark Data Frame APIs are an alternative way of building Data Engineering applications at scale, leveraging the distributed computing capabilities of Spark. Data Engineers from application development backgrounds might prefer Data Frame APIs over Spark SQL for building Data Engineering applications.

  • Data Processing Overview using Spark Data Frame APIs leveraging Scala as Programming Language

  • Processing Column Data using Spark Data Frame APIs leveraging Scala as Programming Language

  • Basic Transformations using Spark Data Frame APIs leveraging Scala as Programming Language - Filtering, Aggregations, and Sorting

  • Joining Data Sets using Spark Data Frame APIs leveraging Scala as Programming Language
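The same kind of processing expressed with Data Frame APIs might look like the sketch below — a join followed by an aggregation. The data sets and column names are hypothetical, chosen only to illustrate the API style:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DataFrameJoinDemo extends App {
  val spark = SparkSession.builder()
    .appName("Data Frame APIs Demo")
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._

  // Hypothetical orders and order items
  val orders = Seq(
    (1, "2023-01-01", "COMPLETE"),
    (2, "2023-01-02", "PENDING")
  ).toDF("order_id", "order_date", "order_status")

  val orderItems = Seq(
    (1, 1, 49.99),
    (2, 1, 19.99),
    (3, 2, 99.99)
  ).toDF("item_id", "order_id", "subtotal")

  // Inner join on order_id, then aggregate revenue per order
  orders.join(orderItems, "order_id")
    .groupBy("order_id", "order_status")
    .agg(round(sum("subtotal"), 2).as("order_revenue"))
    .orderBy("order_id")
    .show()

  spark.stop()
}
```

Note how filtering, joining, and aggregating become method calls on Data Frames instead of SQL clauses — the course covers both styles so you can choose whichever fits your background.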

All the demos are given on our state-of-the-art Big Data cluster. You can avail of one month of complimentary lab access by reaching out to support@itversity.com with a Udemy receipt.

Course Sections

  • 10 sections
  • 232 lessons
  • Section 1: Introduction
  • Section 2: Setting up Environment using AWS Cloud9
  • Section 3: Setting up Environment - Overview of GCP and Provision Ubuntu VM
  • Section 4: Setup Hadoop on Single Node Cluster
  • Section 5: Setup Hive and Spark on Single Node Cluster
  • Section 6: Scala Fundamentals
  • Section 7: Overview of Hadoop HDFS Commands
  • Section 8: Apache Spark 2 using Scala - Data Processing - Overview
  • Section 9: Apache Spark 2 using Scala - Processing Column Data using Pre-defined Functions
  • Section 10: Apache Spark 2 using Scala - Basic Transformations using Data Frames

Course Content

  • All the HDFS Commands that are relevant to validate files and folders in HDFS.
  • Enough Scala to work on Data Engineering Projects using Scala as the programming language
  • Spark Dataframe APIs to solve the problems using Dataframe style APIs.
  • Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark Dataframe APIs
  • Inner as well as outer joins using Spark Data Frame APIs
  • Ability to use Spark SQL to solve the problems using SQL style syntax.
  • Basic Transformations such as Projection, Filtering, Total as well as Aggregations by Keys using Spark SQL
  • Inner as well as outer joins using Spark SQL
  • Basic DDL to create and manage tables using Spark SQL
  • Basic DML or CRUD Operations using Spark SQL
  • Create and Manage Partitioned Tables using Spark SQL
  • Manipulating Data using Spark SQL Functions
  • Advanced Analytical or Windowing Functions to perform aggregations and ranking using Spark SQL
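The last item above — analytical (windowing) functions — can be sketched as follows. Here `spark` is assumed to be an active `SparkSession`, and the `daily_revenue` view with its columns is hypothetical:

```scala
// Rank days by revenue using an analytical (windowing) function.
// Assumes a daily_revenue view with order_date and revenue columns.
spark.sql("""
  SELECT order_date,
         revenue,
         rank() OVER (ORDER BY revenue DESC) AS revenue_rank,
         round(sum(revenue) OVER (), 2) AS total_revenue
  FROM daily_revenue
""").show()
```

Unlike a `GROUP BY`, the `OVER` clause keeps every input row while computing the ranking and the running total alongside it.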


Reviews

  • P
    Prateek Mohanty
    4.0

    good

  • J
    Jatin Kumar
    5.0

    Great course! Clear explanations, practical examples, and hands-on exercises made learning Spark SQL and DataFrames easy and enjoyable. Highly recommended.

  • M
    Manoja Hosadmane
    5.0

    The course gave me a clear understanding of Scala and Spark and helped me learn them in depth. It’s well designed and easy to follow.

  • A
    Avinash Nagul
    5.0

    All good
