Udemy

Mastering Databricks & Apache Spark - Build ETL Data Pipeline

  • 2,515 Students
  • Updated 8/2021
  • Certificate Available
4.1
(435 Ratings)

Course Information

Registration period: Year-round Recruitment
Course Level:
Study Mode:
Duration: 4 Hour(s) 23 Minute(s)
Language: English
Taught by: Priyank Singh
Certificate: Available
*The delivery and distribution of the certificate are subject to the policies and arrangements of the course provider.

Course Overview

Learn the fundamental concepts of Databricks and process big data by building your first data pipeline on Azure

Welcome to the course Mastering Databricks & Apache Spark - Build ETL Data Pipeline.

Databricks combines the best of data warehouses and data lakes into a lakehouse architecture. In this course we will learn how to perform various operations in Scala, Python, and Spark SQL. This will help every student build solutions that create value and develop the mindset to build batch processes in any of these languages. The course shows how to write the same commands in each language, so that you can adapt to your client's needs and deliver a world-class solution. We will build an end-to-end solution in Azure Databricks.


Key Learning Points

  • We will build our own cluster to process our data and, with a one-click operation, load data from different sources into Azure SQL and Delta tables

  • After that, we will use Databricks notebooks to prepare a dashboard that answers business questions

  • Based on our needs, we will deploy infrastructure on the Azure cloud

  • These scenarios will give students 360-degree exposure to the cloud platform and show how to set up various resources

  • All activities are performed in Azure Databricks


Fundamentals

  • Databricks

  • Delta tables

  • Concepts of versioning and vacuum on Delta tables

  • Apache Spark SQL

  • Filtering DataFrames

  • Rename, Drop, Select, Cast

  • Aggregation operations: SUM, AVERAGE, MAX, MIN

  • Rank, Row Number, Dense Rank

  • Building dashboards

  • Analytics
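
The three ranking functions in the list above differ only in how they treat ties. The sketch below is a plain-Python illustration of that difference, not Spark code and not taken from the course; the helper name `rank_rows` and the sample `sales` data are hypothetical.

```python
from itertools import groupby


def rank_rows(rows, key):
    """Return (row, row_number, rank, dense_rank) tuples, ordered by key descending."""
    ordered = sorted(rows, key=key, reverse=True)
    result = []
    position = 0  # ROW_NUMBER: unique and consecutive, even across ties
    dense = 0     # DENSE_RANK: ties share a value, no gap afterwards
    for _, group in groupby(ordered, key=key):
        dense += 1
        rank = position + 1  # RANK: ties share a value, a gap follows
        for row in group:
            position += 1
            result.append((row, position, rank, dense))
    return result


# Hypothetical sample data: (region, sales amount)
sales = [("north", 300), ("south", 500), ("east", 300), ("west", 100)]
ranked = rank_rows(sales, key=lambda r: r[1])
for row, row_number, rank, dense_rank in ranked:
    print(row, row_number, rank, dense_rank)
```

With two regions tied at 300, both receive rank 2 and dense rank 2; the next row gets rank 4 but dense rank 3, while the row numbers simply run 1 to 4.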

This course is suitable for data engineers, BI architects, data analysts, ETL developers, and BI managers.

Course Content

  • 5 section(s)
  • 47 lecture(s)
  • Section 1 Getting Started with Databricks
  • Section 2 Extraction of Data
  • Section 3 Transformation of Data
  • Section 4 Processing XML, JSON, Delta tables
  • Section 5 Loading data and building ETL data pipeline with dashboard
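
The extract, transform, load sequence that Sections 2 to 5 follow can be sketched in a few lines. This is a minimal standard-library illustration only, assuming in-memory sample data and an in-memory SQLite database as a stand-in for the load target; the course itself works in Azure Databricks with Azure SQL and Delta tables, and every name below (`CSV_DATA`, `extract`, `transform`, `load`, the `orders` table) is hypothetical.

```python
import csv
import io
import json
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample inputs, one per source format named in the course.
CSV_DATA = "id,name,amount\n1,alice,10.5\n2,bob,20\n"
JSON_DATA = '[{"id": 3, "name": "carol", "amount": "7.25"}]'
XML_DATA = "<orders><order><id>4</id><name>dave</name><amount>3</amount></order></orders>"


def extract():
    """Read each source into a list of plain dicts."""
    rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
    rows += json.loads(JSON_DATA)
    for order in ET.fromstring(XML_DATA).findall("order"):
        rows.append({child.tag: child.text for child in order})
    return rows


def transform(rows):
    """Cast every column to a consistent type and tidy the name column."""
    return [(int(r["id"]), str(r["name"]).title(), float(r["amount"])) for r in rows]


def load(records, conn):
    """Write the cleaned records to the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the real load target
load(transform(extract()), conn)
total, count = conn.execute("SELECT SUM(amount), COUNT(*) FROM orders").fetchone()
print(count, total)
```

Keeping the three stages as separate functions means any one of them can be swapped out, for example replacing the SQLite load with a different target, without touching the others.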

What You’ll Learn

  • Databricks
  • Build your first data pipeline to process CSV, JSON, XML
  • Orchestrate data pipelines in Azure Data Factory
  • Spin up a Spark cluster
  • Delta tables
  • Concepts of time travel and vacuum on Delta tables
  • Apache Spark SQL
  • Filtering DataFrames
  • Rename, Drop, Select, Cast
  • Aggregation operations: SUM, AVERAGE, MAX, MIN
  • Rank, Row Number, Dense Rank
  • Building dashboards
  • Build a complete project
  • Build an end-to-end data pipeline


Reviews

  • Eddie Contreras
    4.5

    The course covers topics that relate to engineering quickly and easily. For those with some ML and math and programming, setting up the environ and using each language is pivotal. I really appreciate the time and effort displayed in the course.

  • Daniel Sepp
    5.0

    The course proved to be immensely enriching, perfectly aligning with my interests and effectively augmenting my technical prowess and comprehension of computational intelligence. I deeply appreciate the professor's outstanding teaching approach, which fostered an engaging and nurturing learning atmosphere—truly a treasured experience. Gaining hands-on exposure to Databricks and Azure was proved to be an eye-opening and enlightening experience, reinforcing the course's indisputable value. I'm already applying the acquired knowledge to implement solutions at work, I remain grateful for the unparalleled guidance offered throughout this transformative educational journey. Professor Priyank Singh, warmest greetings from Brazil.

  • Vijay More
    3.0

    Definitely not a mastering course, good for beginners. Bit more explanation on certain topics were expected. I was also doing python learning course on udemy in parallel to this course , trainer was more vocal and keeps you more interested in the course there, that was missing here. Could have been better!

  • Tung Dinh _
    3.0

    Advantage: get familiar to databrick, nothing else. Very basic Spark. Disadvantage: no detailed explanation, minimal conversation. Not all the code is converted to Python, instructor is familiar to Scala. To my best knowledge, this course is only suitable for playing around, not professional code. Absolutely no mastering at all.
