Course Information
- Available
- *The delivery and distribution of the certificate are subject to the policies and arrangements of the course provider.
Course Overview
Learn the fundamental concepts of Databricks and process big data by building your first data pipeline on Azure
Welcome to the course on Mastering Databricks & Apache Spark - Build ETL Data Pipeline
Databricks combines the best of data warehouses and data lakes into a lakehouse architecture. In this course we will learn how to perform various operations in Scala, Python, and Spark SQL. This will help students build solutions that create value and develop the mindset to build batch processes in any of these languages. The course shows how to write the same commands in different languages so that, based on your client's needs, you can adapt and deliver a world-class solution. We will build an end-to-end solution in Azure Databricks.
Key Learning Points
We will build our own cluster to process our data, and with a one-click operation we will load data from different sources into Azure SQL and Delta tables
After that, we will use Databricks notebooks to prepare a dashboard that answers business questions
Based on our needs, we will deploy infrastructure on the Azure cloud
These scenarios will give students 360-degree exposure to the cloud platform and show how to set up various resources
All activities are performed in Azure Databricks
Fundamentals
Databricks
Delta tables
Concept of versions and vacuum on Delta tables
Apache Spark SQL
Filtering DataFrames
Rename, Drop, Select, Cast
Aggregation operations: SUM, AVERAGE, MAX, MIN
Rank, Row Number, Dense Rank
Building dashboards
Analytics
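The three ranking functions in the list above (Rank, Row Number, Dense Rank) differ only in how they treat ties. The following is a minimal sketch in plain Python, not Spark code, that mimics their semantics over a single ordered partition. The helper `rank_rows` and the sample `sales` data are hypothetical and exist only for illustration.

```python
def rank_rows(rows, key):
    """Order rows by key (descending) and attach row_number, rank, dense_rank."""
    ordered = sorted(rows, key=key, reverse=True)
    out = []
    rank = 0          # position of the first row in the current tie group
    dense = 0         # count of distinct key values seen so far
    prev = object()   # sentinel that never equals a real key
    for row_number, row in enumerate(ordered, start=1):
        k = key(row)
        if k != prev:
            rank = row_number   # rank skips ahead after a tie group
            dense += 1          # dense_rank never leaves gaps
            prev = k
        out.append((row, row_number, rank, dense))
    return out


# Illustrative data only: (name, amount)
sales = [("Asha", 300), ("Ben", 500), ("Chen", 300), ("Dee", 100)]
result = rank_rows(sales, key=lambda r: r[1])

for (name, amount), row_number, rank, dense_rank in result:
    print(name, amount, row_number, rank, dense_rank)
```

Row number is always unique, rank repeats on ties and then leaves a gap, and dense rank repeats on ties without leaving a gap.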
This course is suitable for data engineers, BI architects, data analysts, ETL developers, and BI managers
Course Content
- 5 section(s)
- 47 lecture(s)
- Section 1 Getting Started with Databricks
- Section 2 Extraction of Data
- Section 3 Transformation of Data
- Section 4 Processing XML, JSON, Delta tables
- Section 5 Loading data and building ETL data pipeline with dashboard
What You’ll Learn
- Databricks
- Build your first data pipeline to process CSV, JSON, XML
- Orchestrate a data pipeline in Azure Data Factory
- Spin up a Spark cluster
- Delta tables
- Concept of time travel and vacuum on Delta tables
- Apache Spark SQL
- Filtering DataFrames
- Rename, Drop, Select, Cast
- Aggregation operations: SUM, AVERAGE, MAX, MIN
- Rank, Row Number, Dense Rank
- Building dashboards
- Build a complete project
- Build an end-to-end data pipeline
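The extract, transform, and load steps behind a pipeline that processes CSV, JSON, and XML can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the Databricks or Spark APIs used in the course. The inline sample data, the column names, and the in-memory `table` standing in for a target table are all assumptions made for the example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical inline samples standing in for source files.
CSV_TEXT = "order_id,amount\n1,10.50\n2,4.00\n"
JSON_TEXT = '[{"order_id": 3, "amount": "7.25"}]'
XML_TEXT = (
    "<orders><order><order_id>4</order_id>"
    "<amount>1.25</amount></order></orders>"
)


def extract():
    """Read every source into a list of dicts that share the same keys."""
    rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
    rows += json.loads(JSON_TEXT)
    for node in ET.fromstring(XML_TEXT).iter("order"):
        rows.append(
            {
                "order_id": node.findtext("order_id"),
                "amount": node.findtext("amount"),
            }
        )
    return rows


def transform(rows):
    """Cast columns to numeric types and rename 'amount' to 'order_amount'."""
    return [
        {"order_id": int(r["order_id"]), "order_amount": float(r["amount"])}
        for r in rows
    ]


def load(rows, target):
    """Append rows to an in-memory list that stands in for a target table."""
    target.extend(rows)
    return len(rows)


table = []
loaded = load(transform(extract()), table)
total = sum(r["order_amount"] for r in table)
print(loaded, total)
```

In a Spark-based pipeline the same three stages would typically be expressed with DataFrame readers, column transformations, and a table writer; the structure of the flow is what this sketch is meant to show.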
Reviews
-
Eddie Contreras
The course covers topics that relate to engineering quickly and easily. For those with some ML and math and programming, setting up the environ and using each language is pivotal. I really appreciate the time and effort displayed in the course.
-
Daniel Sepp
The course proved to be immensely enriching, perfectly aligning with my interests and effectively augmenting my technical prowess and comprehension of computational intelligence. I deeply appreciate the professor's outstanding teaching approach, which fostered an engaging and nurturing learning atmosphere—truly a treasured experience. Gaining hands-on exposure to Databricks and Azure was proved to be an eye-opening and enlightening experience, reinforcing the course's indisputable value. I'm already applying the acquired knowledge to implement solutions at work, I remain grateful for the unparalleled guidance offered throughout this transformative educational journey. Professor Priyank Singh, warmest greetings from Brazil.
-
Vijay More
Definitely not a mastering course, good for beginners. Bit more explanation on certain topics were expected. I was also doing python learning course on udemy in parallel to this course , trainer was more vocal and keeps you more interested in the course there, that was missing here. Could have been better!
-
Tung Dinh _
Advantage: get familiar to databrick, nothing else. Very basic Spark. Disadvantage: no detailed explanation, minimal conversation. Not all the code is converted to Python, instructor is familiar to Scala. To my best knowledge, this course is only suitable for playing around, not professional code. Absolutely no mastering at all.