Machine Learning with Spark
Welcome
What you will learn (1:46)
What is Databricks (8:29)
Getting started (2:01)
Downloading the dataset (2:58)
Downloading the notebooks
Introduction to data science and machine learning
Module Introduction (2:58)
Problems encountered by data scientists (19:39)
Terminology (8:32)
Introduction to regression (11:02)
Introduction to classification (9:04)
Introduction to clustering (9:12)
Labs i - Introduction to regression (14:07)
Labs ii - Introduction to classification (14:12)
Labs iii - Introduction to clustering (14:03)
Feature engineering and selection
Module Introduction (3:30)
Feature scaling and encoding (8:41)
Labs - Feature scaling and encoding (18:46)
Handling and imputing missing values (5:03)
Labs - Handling and imputing missing values (9:05)
Feature selection and dimensionality reduction (15:50)
Labs - Feature selection and dimensionality reduction (16:01)
Metrics
Module Introduction (1:33)
What are Metrics (1:53)
Classification metrics (17:19)
Regression metrics (6:33)
Labs - Metrics (18:17)
Pipelines and Tuning
Module Introduction (1:58)
Pipelines (8:39)
Labs - Pipelines (12:05)
Model selection and hyperparameter tuning (8:58)
Labs - Hyperparameter tuning with HyperOpt (10:44)
Things to Consider
Module Introduction (2:04)
Underfitting and overfitting (3:08)
Imbalanced classes (6:32)
Labs i - Overfitting (4:48)
Labs ii - Imbalanced classes (11:02)
Introduction to Ensemble Modelling
Module Introduction (1:21)
Theory (10:00)
Labs - Ensemble methods (13:14)
Leverage the Pandas UDF on Spark to scale your Pandas code
Labs - UDFs and Pandas UDFs (31:45)
Handling and imputing missing values