CTgoodjobs - Data Engineering with Spark Databricks Delta Lake Lakehouse


Course Information

Registration period: Year-round Recruitment
Course Level: Short Course
Study Mode: Online
Duration: 5 hours 15 minutes
Language: English
Taught by: FutureX Skills
Rating: 4.5 (495 ratings)


Course Overview

Data Engineering with Spark Databricks Delta Lake Lakehouse

Apache Spark, Databricks, Lakehouse, Delta Lake, Delta Tables, Delta Caching, Scala, Python: Data Engineering for beginners

Data Engineering is a vital component of modern data-driven businesses. The ability to process, manage, and analyze large-scale data sets is a core requirement for organizations that want to stay competitive. In this course, you will learn how to build a data pipeline using Apache Spark on Databricks' Lakehouse architecture. This will give you practical experience in working with Spark and Lakehouse concepts, as well as the skills needed to excel as a Data Engineer in a real-world environment.

Throughout the Course, You Will Learn:

Conducting analytics using Python and Scala with Spark.
Applying Spark SQL and Databricks SQL for analytics.
Developing a data pipeline with Apache Spark.
Becoming proficient in Databricks' free edition.
Managing a Delta table by accessing version history, restoring data, and utilizing time travel features.
Using Unity Catalog Volumes for file storage and operations.
Optimizing query performance using Delta Cache.
Working with Delta Tables and Databricks File System.
Gaining insights into real-world scenarios from experienced instructors.

Course Structure:

Beginning by familiarizing yourself with Databricks' free edition and creating a basic pipeline using Spark.
Progressing to more complex topics after gaining comfort with the platform.
Learning analytics with Spark using Python and Scala, including Spark transformations, actions, joins, Spark SQL, and DataFrame APIs.
Acquiring the knowledge and skills to operate a Delta table, including accessing its version history, restoring data, and utilizing time travel functionality using Spark and Databricks SQL.
Understanding how to use Delta Cache to optimize query performance.
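
The version history, time travel, and restore ideas listed above can be sketched in plain Python. This is only a toy model under stated assumptions, not the Delta Lake API: the class `VersionedTable` and its methods (`append`, `delete_where`, `read`, `history`, `restore`) are hypothetical names invented for illustration, and the sample rows are made up. It keeps a full snapshot per commit, whereas a real Delta table records changes in a transaction log.

```python
from copy import deepcopy


class VersionedTable:
    """Toy in-memory table that keeps one snapshot per commit.

    Hypothetical illustration only; this is not the Delta Lake API.
    """

    def __init__(self):
        # Version 0 is the empty table created here.
        self._snapshots = [[]]
        self._operations = ["CREATE"]

    @property
    def version(self):
        """Number of the latest committed version."""
        return len(self._snapshots) - 1

    def _commit(self, rows, operation):
        # Every change appends a new snapshot; old ones are never modified.
        self._snapshots.append(rows)
        self._operations.append(operation)

    def append(self, new_rows):
        current = deepcopy(self._snapshots[-1])
        self._commit(current + deepcopy(new_rows), "APPEND")

    def delete_where(self, predicate):
        kept = [row for row in self._snapshots[-1] if not predicate(row)]
        self._commit(deepcopy(kept), "DELETE")

    def read(self, version_as_of=None):
        """Time travel: read the latest snapshot, or an older one by number."""
        index = self.version if version_as_of is None else version_as_of
        if not 0 <= index <= self.version:
            raise ValueError(f"version {index} does not exist")
        return deepcopy(self._snapshots[index])

    def history(self):
        """Return (version, operation) pairs, oldest first."""
        return list(enumerate(self._operations))

    def restore(self, version):
        """Restore an old snapshot by committing it as a new version."""
        self._commit(self.read(version_as_of=version), "RESTORE")


table = VersionedTable()
table.append([{"id": 1, "item": "a"}, {"id": 2, "item": "b"}])  # version 1
table.delete_where(lambda row: row["id"] == 2)                  # version 2
table.restore(1)                                                # version 3

print(table.history())
print(table.read(version_as_of=2))
print(table.read())
```

In this sketch, restoring does not erase anything: the deleted row comes back in a new version, and the version without it remains readable through `read(version_as_of=2)`.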

Optional Lectures on AWS Integration:

'Setting up Databricks Account on AWS' and 'Running Notebooks Within a Databricks AWS Account.'
Building an ETL pipeline with Delta Live Tables
Providing additional opportunities to explore Databricks within the AWS ecosystem.

This course is designed for Data Engineering beginners; no prior knowledge of Python or Scala is required. However, some familiarity with databases and SQL is necessary to succeed in this course. Upon completion, you will have the skills and knowledge required to succeed in a real-world Data Engineer role.

Throughout the course, you will work with hands-on examples and real-world scenarios to apply the concepts you learn. By the end of the course, you will have the practical experience and skills required to understand Spark and Lakehouse concepts, and to build a scalable and reliable data pipeline using Spark on Databricks' Lakehouse architecture.

This course uses high-quality AI-generated text-to-speech narration to complement the powerful visuals and enhance your learning experience.


Course Content

7 sections
48 lectures

Section 1 Introduction
Section 2 Working with Databricks Storage: DBFS, Volumes, and Delta Tables
Section 3 Data Engineering with Apache Spark
Section 4 Data Lakehouse Delta Lake and Delta Tables deep dive
Section 5 Databricks Labs on AWS
Section 6 Bonus Section - AWS Data Engineering Labs
Section 7 Conclusion and where to go from here?


What You’ll Learn

Acquiring the necessary skills to qualify for an entry-level Data Engineering position
Developing a practical comprehension of Data Lakehouse concepts through hands-on experience
Learning to operate a Delta table by accessing its version history, recovering data, and utilizing time travel functionality
Optimizing a Delta table with techniques such as caching, partitioning, and Z-ordering for faster analytics
Obtaining practical knowledge in constructing a data pipeline through the usage of Apache Spark on the Databricks platform
Doing analytics within a Databricks AWS Account

Udemy
