Course Information
Course Overview
First steps to Extract, Transform and Load data using Apache Beam and Deploy Pipelines on Google Dataflow
This course introduces Apache Beam, one of the Apache Software Foundation's newer data pipeline development frameworks, and shows how it is gaining popularity alongside Google Dataflow. In summary, the course covers the following topics:
1. Understand its inner workings
2. What its benefits are
3. How to use it for development via Google Colab, with no installation on your local machine
4. Its main functions
5. Configure the Apache Beam Python SDK locally
6. How to deploy a batch pipeline on Google Dataflow
This course is dynamic; you will receive updates whenever possible.
It is important to remember that this course does not teach Python, but uses it. You should be comfortable with Python basics: defining functions, creating objects, and working with data types.
Also, if you are interested in section 4, which covers deploying a pipeline on Google Dataflow, you will need a free GCP account. It's a simple process, but it requires a credit card!
I kindly ask you to consider all the effort that went into putting this course together and to leave a good rating at the end. Even though the course is simple, it was made with the good intent of sharing knowledge at a low price. Thanks, and I hope you enjoy it!
___________________________________________________________________________________________________________
Requirements:
· Basic knowledge of Python
· Have Python 3.7 or greater installed locally (from section 4)
· Free account at GCP (from section 4)
Schedule:
· Section 2 – Concepts
· Section 3 – Main Functions
· Section 4 – Apache Beam on Google Dataflow
Course Content
- 3 section(s)
- 21 lecture(s)
- Section 1 Apache Beam Concepts
- Section 2 Apache Beam Main Functions
- Section 3 Apache Beam + GCP = Dataflow
What You’ll Learn
- Apache Beam
- ETL
- Python
- Google Cloud
- Dataflow
- Google Cloud Storage
- BigQuery
Reviews
-
Alejandro Retuert
Ideal for getting started. My suggestion is to spend less time repeating previous steps and to focus on more of the options available in GCP (using Pub/Sub, among many other services), on cases where things go wrong and how to view the errors in logging, etc.
-
Juan Pablo Vallejo Figueroa
Introductory, but not production level; lacks structure for real projects
-
Prasad Kalekar
A clearer voice would have been great
-
Pradeep Reddy C N
Good course