CTgoodjobs - PySpark: Python, Spark and Hadoop Coding Framework & Testing


Course Information

Registration period

Year-round Recruitment

Course Level

Short Course

Study Mode

Online

Duration

4 hours 2 minutes

Language

English

Taught by

FutureX Skills

Rating

4.4

(208 Ratings)

Course Overview

PySpark: Python, Spark and Hadoop Coding Framework & Testing

PyCharm: Big Data Python Spark, PySpark Coding Framework, Logging, Error Handling, Unit Testing, PostgreSQL, Hive

This course will bridge the gap between academic learning and real-world applications, preparing you for an entry-level Big Data Python Spark developer role. You will gain hands-on experience and learn industry-standard best practices for developing Python Spark applications. Covering both Windows and Mac environments, this course ensures a smooth learning experience regardless of your operating system.

You will learn Python Spark coding best practices to write clean, efficient, and maintainable code. Logging techniques will help you track application behavior and troubleshoot issues effectively, while error handling strategies will ensure your applications are robust and fault-tolerant. You will also learn how to read configurations from a properties file, making your code more adaptable and scalable.

Key Modules:

Python Spark coding best practices for clean, efficient, and maintainable code using PyCharm
Implementing logging to track application behavior and troubleshoot issues
Error handling strategies to build robust and fault-tolerant applications
Reading configurations from a properties file for flexible and scalable code
Developing applications using PyCharm in both Windows and Mac environments
Setting up and using your local environment as a Hadoop Hive environment
Reading and writing data to a Postgres database using Spark
Working with Python unit testing frameworks to validate your Spark applications
Building a complete data pipeline using Hadoop, Spark, and Postgres
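
To give a feel for how the logging, error handling, and configuration modules fit together, here is a minimal sketch using only the Python standard library. It is not taken from the course material. It assumes an INI-style properties file with a `[postgres]` section, and the names `load_db_settings`, `SAMPLE`, and the logger name `pipeline` are hypothetical.

```python
import configparser
import logging

# Hypothetical logger name; a real project would pick its own.
logger = logging.getLogger("pipeline")


def load_db_settings(text: str) -> dict:
    """Parse INI-style properties text and return the [postgres] settings."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
        settings = dict(parser["postgres"])
    except (configparser.Error, KeyError) as exc:
        # Log the failure, then raise one well-defined error type for callers.
        logger.error("Could not load configuration: %s", exc)
        raise ValueError("invalid pipeline configuration") from exc
    logger.info("Loaded %d settings", len(settings))
    return settings


# Assumed sample content; the URL and user are placeholders.
SAMPLE = """
[postgres]
url = jdbc:postgresql://localhost:5432/demo
user = demo_user
"""

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(load_db_settings(SAMPLE))
```

Keeping connection details in a properties file rather than in code means the same application can point at a different database without being edited, and wrapping the parse step in one function gives a single place to log and handle configuration errors.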

Prerequisites:

Basic programming skills
Basic database knowledge
Entry-level understanding of Hadoop

This course uses high-quality AI-generated text-to-speech narration to complement the powerful visuals and enhance your learning experience.


Course Content

10 sections
51 lectures

Section 1 Introduction
Section 2 Setting up Hadoop Spark development environment
Section 3 Creating a PySpark coding framework
Section 4 Logging and Error Handling
Section 5 Creating a Data Pipeline with Hadoop Spark and PostgreSQL
Section 6 Reading configuration from properties file
Section 7 Unit testing PySpark application
Section 8 spark-submit
Section 9 Appendix - Big Data Hadoop Hive for beginners
Section 10 Appendix - PySpark on Colab and DataFrame deep dive


What You’ll Learn

Industry-standard PySpark (Python Spark) coding practices: logging, error handling, reading configuration, and unit testing
Building a data pipeline using Hive, Spark and PostgreSQL
Python Spark Hadoop development using PyCharm

Udemy
