Course Information
Course Overview
Practice data science with 24 hours of material using real examples
This course explores several modern machine learning and data science techniques in R, one of the most widely used tools among data scientists. We showcase a wide array of statistical and machine learning techniques. In particular:
- Using R's statistical functions for drawing random numbers, calculating densities, histograms, etc.
- Solving supervised ML problems using the caret package
- Data processing using sqldf, caret, etc.
- Unsupervised techniques such as PCA, DBSCAN, K-means
- Calling deep learning models in Keras (Python) from R
- Using the powerful XGBoost method for both regression and classification
- Doing interesting plots, such as geo-heatmaps and interactive plots
- Tune hyperparameters for several ML methods using caret
- Do linear regression in R, build log-log models, and do ANOVA
- Estimate mixed effects models to explicitly model the covariances between observations
- Train outlier-robust models using robust regression and quantile regression
- Identify outliers and novel observations
- Estimate ARIMA (time series) models to predict temporal variables
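As a rough illustration of two items from the list above (R's statistical functions and log-log models), here is a minimal sketch in base R. It uses simulated data rather than any dataset from the course, and the variable names (`price`, `qty`) are hypothetical.

```r
# Illustrative sketch only: simulated data, not taken from the course material.
set.seed(42)

# Draw random numbers, evaluate a density, and compute a histogram
x  <- rnorm(1000, mean = 0, sd = 1)   # 1000 draws from a standard normal
d0 <- dnorm(0)                        # standard normal density at 0
h  <- hist(x, plot = FALSE)           # bin counts without drawing the plot

# Log-log model: log(qty) = a + b * log(price) + noise,
# where the slope b can be read as an elasticity.
price <- runif(200, min = 1, max = 10)
qty   <- exp(2 - 1.5 * log(price) + rnorm(200, sd = 0.1))
fit   <- lm(log(qty) ~ log(price))

coef(fit)   # the slope estimate should land near the simulated value of -1.5
```

Because the data are simulated with a known slope, the fitted coefficient can be checked against the true value, which is a convenient way to confirm that a model specification does what is intended.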
Most of the examples presented in this course come from real datasets collected from web sources such as Kaggle, the US Census Bureau, etc. All the lectures can be downloaded and come with the corresponding material. The teaching approach is to briefly introduce each technique and then focus on the computational aspect. Mathematical formulas are avoided as much as possible, so as to concentrate on practical implementation.
This course covers most of what you would need to work as a data scientist, or compete in Kaggle competitions. It is assumed that you already have some exposure to data science / statistics.
Course Content
- 16 section(s)
- 75 lecture(s)
- Section 1 Basics
- Section 2 General R programming
- Section 3 Random numbers, probability and statistics
- Section 4 Advanced data processing using sqldf
- Section 5 Statistical modelling: Linear regression
- Section 6 Statistical modelling: GLM and Nonlinear regression
- Section 7 XGBOOST: Gradient Boosting
- Section 8 Principal components
- Section 9 Machine learning - the CARET package - introduction
- Section 10 Sound
- Section 11 Machine learning - the CARET package - Supervised problems
- Section 12 Unsupervised problems
- Section 13 Deep learning / Neural networks via Keras in R
- Section 14 Time series in R
- Section 15 Visualizing data
- Section 16 Creating R packages
What You’ll Learn
- Do machine learning in R
- Process data for modelling
Reviews
- Ewelina Olbromska
  Great course.
- Arunima De
  Awesome
- David Rebolo
  Good:
  - The course gives a good overview.
  - The statistics part is very clear and complete.
  - Many different data sets and real-world examples are presented.
  Should improve:
  - The machine learning theory is not deep enough (I had to complement the course with YouTube videos, papers, blogs...).
  - There are no examples of ML for regression problems.
  - Sometimes the course seems a little bit improvised. I think some PowerPoint presentations could help the student.
- Panchajanya Sai
  Worst