Udemy

AWS Cloud Projects for Data & AI Engineers: 5 Projects

  • 30 Students
  • Updated 10/2025
4.5 (1 rating)

Course Information

Registration period: Year-round Recruitment
Course Level:
Study Mode:
Duration: 4 hours
Language: English
Taught by: Pravin Mishra | AWS Certified Cloud Practitioner | Solutions Architect
Rating: 4.5 (1 rating)

Course Overview

Build a production-ready Lakehouse on AWS (S3, Glue, Athena, Lake Formation) — plus Orchestration, Data Quality & AI.

Build portfolio-grade AWS Cloud projects that mirror real data teams.

This course is 100% hands-on. You’ll design and operate a production-style Data Lakehouse on AWS, enforce Data Governance with Lake Formation, stand up a Redshift Serverless warehouse with SCD2, run a Batch Ops simulation (break/fix/backfill), and prepare AI/ML-ready datasets—exactly how modern orgs work.

You will use S3, Glue (PySpark), Athena, Lake Formation, Glue Catalog, Apache Iceberg, Redshift Serverless (external & managed tables), IAM, Lambda, DynamoDB, CloudWatch/CloudTrail—with a focus on cost, reliability, and auditability.

What you’ll build (5 connected projects)

  • Project 1 — Lakehouse on AWS: S3 + Apache Iceberg
    Land RAW to S3, transform with Glue, publish Iceberg bronze/silver, implement partitioning & schema evolution, and gate publishes with data quality checks.

  • Project 2 — Data Governance with Lake Formation
    Enforce tag-based policies (LF-Tags), column masking, and row-level filters (data cells filters). Prove access in Athena (Analyst vs. Scientist). Add lightweight auditing.

  • Project 3 — Data Warehouse on Redshift Serverless (External + SCD2)
    Expose Iceberg via external tables, build star schema (facts/dims), implement SCD2 with MERGE, and tune performance/cost (sort/dist keys, WLM/workgroup choices).

  • Project 4 — A Day in the Life of a Data Engineer (Batch Ops Simulation)
    Orchestrate ingest → DQ → publish, handle schema change / late data, rerun safely, backfill last N days, and write a clear incident postmortem.

  • Project 5 — AI/ML Readiness & Serving
    Curate ML-friendly/feature-like tables, ensure reproducible training sets using Iceberg snapshots/time travel, and (optional) integrate SageMaker/Athena for model workflows.
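
Project 3 mentions SCD2 with MERGE. As a rough illustration of what that pattern does, the sketch below applies Type 2 logic to an in-memory list in Python. It is not the course's Redshift SQL. The `DimRow` class, the `customer_id` and `city` fields, and the `apply_scd2` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DimRow:
    customer_id: int                  # business key (hypothetical)
    city: str                         # tracked attribute (hypothetical)
    valid_from: date
    valid_to: Optional[date] = None   # None means "still the current version"

    @property
    def is_current(self) -> bool:
        return self.valid_to is None


def apply_scd2(dim: list, incoming: dict, as_of: date) -> list:
    """Return a new dimension list after applying incoming {customer_id: city}."""
    current = {row.customer_id: row for row in dim if row.is_current}
    result = []

    # Step 1: close the current version of every key whose attribute changed.
    for row in dim:
        new_city = incoming.get(row.customer_id)
        if row.is_current and new_city is not None and new_city != row.city:
            result.append(replace(row, valid_to=as_of))
        else:
            result.append(row)

    # Step 2: open a new version for changed keys and for brand-new keys.
    for key, city in incoming.items():
        existing = current.get(key)
        if existing is None or existing.city != city:
            result.append(DimRow(key, city, valid_from=as_of))

    return result


dim = [
    DimRow(1, "Pune", date(2024, 1, 1)),
    DimRow(2, "Delhi", date(2024, 1, 1)),
]
updated = apply_scd2(dim, {1: "Mumbai", 2: "Delhi", 3: "Goa"}, date(2025, 1, 1))
```

In this sketch an unchanged key produces no new row, so applying the same batch twice leaves the dimension as it was after the first run. A SQL MERGE would typically express the same two steps (expire the matched row, insert the new version) against warehouse tables.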

Course Content

  • 2 section(s)
  • 15 lecture(s)
  • Section 1 Project 1 — Data Lakehouse on AWS: S3 + Apache Iceberg
  • Section 2 Project 2 — Data Governance with AWS Lake Formation

What You’ll Learn

  • Design an AWS Data Lakehouse with S3 + Glue + Iceberg + Athena + Glue Catalog
  • Apply Data Governance using Lake Formation: LF-Tags/TBAC, PII masking, row-level security
  • Build a Redshift Serverless warehouse: external tables over Iceberg, star schema, SCD2 with MERGE
  • Operate batch pipelines: orchestrate runs, handle break/fix, idempotent replays, and backfills
  • Validate data with quality checks and use auditing/lineage (Lambda+DynamoDB, CloudWatch/CloudTrail)
  • Produce ML-ready datasets and reproducible training views via Iceberg snapshots/time travel
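
The ideas of a quality gate, an idempotent replay, and a backfill of the last N days can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: a dict stands in for a partitioned table, and `last_n_days`, `passes_quality_checks`, `publish_partition`, and `backfill` are hypothetical helpers, not part of any AWS service or of the course material.

```python
from datetime import date, timedelta


def last_n_days(end: date, n: int) -> list:
    """Partition dates for the n days ending at `end` (inclusive), oldest first."""
    return [end - timedelta(days=offset) for offset in range(n - 1, -1, -1)]


def passes_quality_checks(rows: list) -> bool:
    """Hypothetical gate: batch is non-empty, has no missing ids, no duplicate ids."""
    ids = [row.get("id") for row in rows]
    return bool(rows) and None not in ids and len(ids) == len(set(ids))


def publish_partition(table: dict, day: date, rows: list) -> bool:
    """Overwrite the partition for `day` so that a rerun never duplicates rows."""
    if not passes_quality_checks(rows):
        return False              # gate failed: leave the table untouched
    table[day] = list(rows)       # replace, do not append
    return True


def backfill(table: dict, source: dict, end: date, n: int) -> list:
    """Re-publish the last n daily partitions; return the days that failed the gate."""
    failed = []
    for day in last_n_days(end, n):
        if not publish_partition(table, day, source.get(day, [])):
            failed.append(day)
    return failed


source = {
    date(2025, 10, 1): [{"id": 1}, {"id": 2}],
    date(2025, 10, 2): [{"id": 1}, {"id": 1}],   # duplicate id, should be rejected
    date(2025, 10, 3): [{"id": 3}],
}
table = {}
failed_days = backfill(table, source, end=date(2025, 10, 3), n=3)
```

Because each publish overwrites a whole partition, running `backfill` a second time with the same inputs leaves `table` unchanged, which is the property that makes a replay safe.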


Reviews

  • j .thomas
    4.5

    this what exactly i was looking for, excellent
