Course Information
Course Overview
"A thorough introduction to Polars" - Ritchie Vink, creator of Polars - over 3,000 learners to date!
In this course I show you how to take advantage of Polars, the fast-growing open source dataframe library that is becoming the go-to choice for data scientists in Python. I am a Polars contributor with a focus on making Polars accessible to new users, and I keep this course up to date with new Polars releases (updated to version 1.30.0).
"Thank you for your great work with this course - I've optimized some code thanks to it already!" - Maiia Bocharova
The course is for data scientists who have some familiarity with a dataframe library like Pandas but who want to move to Polars because it is easier to write and faster to run. The core materials are Jupyter notebooks that examine each topic in depth. Each notebook comes with a set of exercises to help you develop your understanding of the core concepts. For many key topics this course is the only source of documentation for learners, and that material comes from my time examining the Polars source code.
An important note about videos: this is primarily a notebook course and not a video course. Not all of the lectures have videos, and some of the videos may have components that are not up to date. Why? Because the Polars API has changed too often to allow me to keep the videos current. Instead I focus on keeping the notebooks up to date with an extensive automated testing system that alerts me to changes in the API. I release an updated version of the course about twice a month in response to changes in Polars.
The course introduces the syntax of Polars and shows you the many ways that Polars allows you to produce queries that are easy to read and write. However, the course also delves deeper to help you understand and exploit the algorithms that drive the outstanding performance of Polars.
By the end of the course you will have optimised ways to:
- load and transform your data from CSV, Excel, Parquet, cloud storage or a database
- run your analysis in parallel
- understand optimal patterns for building queries
- work with larger-than-memory datasets
- carry out aggregations on your data
- combine your datasets with joins and concatenations
- work with nested dtypes including lists and structs
- optimise the speed and memory usage of your queries
- work with string and categorical data
- visualise your outputs with Matplotlib, Seaborn, Plotly, hvPlot & Altair
- prepare your data for machine learning pipelines with sklearn
Course Content
- 9 sections
- 65 lectures
- Section 1 Up and running with Polars
- Section 2 Filtering rows
- Section 3 Selecting columns and transforming dataframes
- Section 4 Data types and missing values
- Section 5 Grouping and aggregation
- Section 6 Combining dataframes
- Section 7 Input/Output
- Section 8 Time series analysis
- Section 9 Nested dtypes
What You’ll Learn
- Taking advantage of parallel and optimised analysis with Polars
- Working with larger-than-memory data
- Using Polars expressions for analysis that is easy to read and write
- Loading data from a wide variety of data sources
- Combining data from different datasets using fast join operations
- Grouping and parallel aggregations
- Deriving insight from time series
- Preparing data for machine learning pipelines
- Visualising data with Matplotlib, Seaborn, Altair & Plotly
- Using Polars with Scikit-learn
Reviews
- Mohammad Al Mashwakhi: Not updated
- John Ortiz Martinez: It would be best if there were more videos or at least a final project to apply what we have learned. That said it is a good course to learn about polars, the technical level is adequate
- Bernd Frohlich: I like the material, the way how the concepts are explained and the clarity of the concepts.
- Tamas Valuska: Good intro into polars.