CTgoodjobs - Mastering Big Data Analytics with PySpark

Course Information

Registration period: Year-round Recruitment
Course level: Short Course
Study mode: Online
Duration: 8 hours 7 minutes
Language: English
Taught by: Packt Publishing
Certificate: Available*
Rating: 4.5 (63 ratings)

*The delivery and distribution of the certificate are subject to the policies and arrangements of the course provider.

Course Overview

Mastering Big Data Analytics with PySpark

Effectively apply Advanced Analytics to large datasets using the power of PySpark

PySpark helps you perform data analysis at scale; it enables you to build more scalable analyses and pipelines. This course starts by introducing you to PySpark's potential for performing effective analyses of large datasets. You'll learn how to interact with Spark from Python and connect Jupyter to Spark to provide rich data visualizations. After that, you'll delve into Spark's components and architecture.

You'll learn to work with Apache Spark and perform ML tasks more smoothly than before. You'll gather and query data using Spark SQL to overcome the challenges involved in reading it. You'll use the DataFrame API to work with Spark MLlib and learn about the Pipeline API. Finally, the course provides tips and tricks for deploying your code and tuning performance.

By the end of this course, you will not only be able to perform efficient data analytics but will also have learned to use PySpark to analyze large datasets at scale in your organization.

About the Author

Danny Meijer works as the Lead Data Engineer in the Netherlands for the Data and Analytics department of a leading sporting goods retailer. He is a business process expert, big data scientist, and data engineer, a mix of skills anchored by his business-first approach to data science and data engineering.

He has over 13 years of IT experience across various domains, with skills ranging from (big) data modeling, architecture, design, and development to project and process management; he also has extensive experience with process mining, data engineering on big data, and process improvement.

As a certified data scientist and big data professional, he knows his way around data and analytics and is proficient in a variety of programming languages. He has extensive experience with various big data technologies and is fluent in NoSQL, Hadoop, Python, and of course Spark.

Danny is a driven person, motivated by everything data and big data. He loves math, machine learning, and tackling difficult problems.

Course Content

9 sections, 41 lectures

Section 1 Python and Spark: A Match Made in Heaven
Section 2 Working with PySpark
Section 3 Preparing Data Using Spark SQL
Section 4 Machine Learning with Spark MLlib
Section 5 Classification and Regression
Section 6 Analyzing Big Data
Section 7 Processing Natural Language in Spark
Section 8 Machine Learning in Real-Time
Section 9 The Power of PySpark

What You’ll Learn

Gain a solid knowledge of vital Data Analytics concepts via practical use cases
Create elegant data visualizations using Jupyter
Process and analyze large datasets using PySpark
Utilize Spark SQL to easily load big data into DataFrames
Create fast and scalable Machine Learning applications using MLlib with Spark
Perform exploratory Data Analysis in a scalable way
Achieve scalable, high-throughput and fault-tolerant processing of data streams using Spark Streaming

Reviews

Nivetta T
4.5
Nice conceptual explanation.

Aravindmohandas
3.0
Very fast.

Fredrik Lundberg
3.5
I would have liked a bit more on the Spark ML library and how the different methods there relate to scalable data analytics.

Panthea Azadeh
5.0
It was a very informative course. As a follow-up or addition, it would be very helpful to go through a project from A to Z to see all the tricks in action. I would specifically like to get more hands-on practice running Spark in production and at scale. Thanks for sharing your expertise with us.

Udemy
