Udemy

AI Vision Systems for Self-Driving Cars in Production on AWS

  • 70 Students
  • Updated 3/2026
5.0 (3 Ratings)

Course Information

Registration period: Year-round enrollment
Course Level:
Study Mode:
Duration: 6 hours 18 minutes
Language: English
Taught by: Patrik Szepesi
Rating: 5.0 (3 Ratings)

Course Overview

Computer Vision on AWS: SageMaker, Rekognition, ViTs, and Meta's Segment Anything Model for Detection + Segmentation + Math

Building a successful computer vision product—especially for self-driving car perception—starts with two things: strong foundations and real, scalable systems.

In this course, you’ll learn how to build your own autonomous driving–style vision pipeline using Meta’s Segment Anything Model (SAM), Vision Transformers (ViTs), and AWS Rekognition—while actually understanding the math and intuition behind how these models work.

We begin by exploring Vision Transformers from the ground up, focusing on clear, intuitive explanations of patch embeddings, attention mechanisms, and model representations. You’ll see the underlying mathematics of attention, embeddings, and similarity—and how these ideas translate into the perception capabilities modern self-driving stacks rely on. From there, we dive into Meta’s SAM architecture, explaining how prompts, embeddings, and mask decoding work together to produce high-quality segmentation results—again connecting the math to the behavior you observe, without treating the model as a black box.
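As a concrete taste of the attention math described above, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside a Vision Transformer. The tiny dimensions and random inputs are illustrative assumptions, not material from the course itself:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise patch similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mix of value vectors

# Toy example: 4 patch embeddings of dimension 8 attending to one another
rng = np.random.default_rng(0)
Q = K = V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one contextualized vector per patch
```

Each row of the softmax output sums to 1, so every patch's new representation is a convex combination of all patches' value vectors; this is the "similarity" intuition the course builds the ViT material on.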

You’ll then see how these open-source models fit into real-world self-driving perception workflows. We integrate AWS Rekognition for high-level detection and metadata extraction, and combine it with SAM to create automated, pixel-level labeling pipelines—the kind used to scale dataset creation for autonomous driving. Throughout, you’ll learn how model outputs (scores, embeddings, masks) relate to the underlying objectives and representations that make the pipeline reliable.
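To make that hand-off concrete, here is a hedged sketch of one way to wire the two together: Rekognition's detect_labels call returns bounding boxes normalized to [0, 1], which are scaled to pixel coordinates and passed as box prompts to SAM's SamPredictor. The file name, checkpoint path, and confidence threshold are placeholder assumptions; the course's actual pipeline may differ:

```python
import boto3
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Placeholder input frame -- swap in your own image
image = cv2.cvtColor(cv2.imread("street_scene.jpg"), cv2.COLOR_BGR2RGB)
h, w = image.shape[:2]

# 1) High-level detection with Rekognition (boxes come back normalized)
rekognition = boto3.client("rekognition")
with open("street_scene.jpg", "rb") as f:
    response = rekognition.detect_labels(Image={"Bytes": f.read()}, MinConfidence=70)

# 2) Pixel-level segmentation with SAM, prompted by each detected box
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")  # placeholder path
predictor = SamPredictor(sam)
predictor.set_image(image)  # expects an RGB uint8 array

for label in response["Labels"]:
    for instance in label.get("Instances", []):
        bb = instance["BoundingBox"]  # normalized Left/Top/Width/Height
        box = np.array([bb["Left"] * w, bb["Top"] * h,
                        (bb["Left"] + bb["Width"]) * w,
                        (bb["Top"] + bb["Height"]) * h])
        masks, scores, _ = predictor.predict(box=box, multimask_output=False)
        print(label["Name"], instance["Confidence"], masks[0].shape, scores[0])
```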

A strong emphasis is placed on visualization and practical understanding. You’ll inspect masks, bounding boxes, confidence signals, embeddings, and failure cases, and learn how mathematical concepts translate directly into model behavior you can observe, debug, and improve—critical when building perception systems for safety-sensitive applications like self-driving cars.
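For illustration, a minimal matplotlib sketch of that kind of inspection, assuming you already have an image, a boolean SAM mask, its predicted IoU score, and the prompting box from the previous step (the variable names are hypothetical):

```python
import matplotlib.pyplot as plt
import matplotlib.patches as patches

def show_detection(image, mask, box, score):
    """Overlay one SAM mask and its prompt box on the source image."""
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.imshow(image)
    ax.imshow(mask, alpha=0.4)  # semi-transparent mask overlay
    x0, y0, x1, y1 = box
    ax.add_patch(patches.Rectangle((x0, y0), x1 - x0, y1 - y0,
                                   fill=False, edgecolor="lime", linewidth=2))
    ax.set_title(f"Predicted IoU score: {score:.3f}")
    ax.axis("off")
    plt.show()

# Hypothetical usage with outputs from the previous sketch:
# show_detection(image, masks[0], box, scores[0])
```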

By the end of the course, you won’t just know how to run SAM or call an AWS API. You’ll understand why the models work, how to combine managed cloud services with open-source research, and how to think like someone building a real computer vision startup focused on scalable autonomous vehicle perception—not just a demo.

This course is ideal if you want to go beyond surface-level tutorials and gain a clear, intuitive understanding of modern computer vision systems—from the math behind Transformers and segmentation to production-grade perception pipelines used in autonomous driving.

Course Content

  • 8 sections
  • 50 lectures
  • Section 1: What We Are Building
  • Section 2: Mathematics Behind Vision Transformers
  • Section 3: Mathematics Behind Meta's SAM (Segment Anything Model)
  • Section 4: Setting Up Our AWS Environment
  • Section 5: Setting Up Open Source Models Like Meta's SAM
  • Section 6: Visualizing Our Outputs
  • Section 7: Saving Results to S3
  • Section 8: Testing + Setup

What You’ll Learn

  • Build an end-to-end auto-labeling pipeline using Segment Anything (SAM) for large-scale image datasets
  • Understand how Vision Transformers (ViTs) work internally, including patch embeddings and self-attention
  • Explain the core mathematics behind SAM, including mask decoding and prompt conditioning
  • Run GPU-accelerated segmentation workloads efficiently using modern deep-learning stacks
  • Compare SAM ViT-B, ViT-L, and ViT-H models and choose the right one for cost, speed, and accuracy
  • Integrate AWS Rekognition for high-level object detection and metadata extraction
  • Combine AWS Rekognition outputs with SAM masks to create precise, pixel-level labels
  • Visualize segmentation masks, bounding boxes, and confidence scores for model debugging
  • Analyze trade-offs between open-source CV models and managed cloud services
  • Image segmentation
  • How to use open-source models in AWS SageMaker
  • Optimize performance and memory usage when running SAM on large images
  • Use AWS-based pipelines to scale computer-vision workloads reliably (one possible S3 upload step is sketched after this list)
  • Bridge the gap between theory (math + models) and practical production pipelines
  • AWS Rekognition
  • Object detection
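Since the curriculum also covers persisting results to S3 (Section 7) and scaling pipelines on AWS, here is a small sketch of one way to upload a segmentation mask as a PNG with boto3. The bucket and key names are placeholders, not the course's actual naming scheme:

```python
import io

import boto3
import numpy as np
from PIL import Image

def save_mask_to_s3(mask: np.ndarray, bucket: str, key: str) -> None:
    """Encode a boolean mask as a PNG and upload it to S3."""
    buffer = io.BytesIO()
    Image.fromarray(mask.astype(np.uint8) * 255).save(buffer, format="PNG")
    buffer.seek(0)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=buffer.getvalue())

# Placeholder usage:
# save_mask_to_s3(masks[0], "my-labeling-bucket", "labels/street_scene_0.png")
```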


Reviews

  • Stephen Zhang (5.0)
    "this is so amazing so far. So much depth, yet clarity"

