Unsupervised Learning Basics
K-Means Algorithm: Intro
K-Means Algorithm: 2nd Step
K-Means Using Scikit-Learn
Cross Tabulation Overview
K-Means: Reaching Convergence
Inertia measures how well a dataset was clustered by K-Means. It is calculated by measuring the distance between each data point and its assigned centroid, squaring this distance, and summing these squares across all data points in all clusters.
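The calculation can be sketched in a few lines of plain Python. The points, labels, and centroids below are made-up toy values, and the `inertia` helper is a hypothetical name used only for illustration; it is not part of any library.

```python
# Hypothetical toy data: four 2-D points, the cluster each one is assigned to,
# and the centroid of each cluster.
points = [(1.0, 1.0), (1.0, 3.0), (5.0, 5.0), (7.0, 5.0)]
labels = [0, 0, 1, 1]
centroids = [(1.0, 2.0), (6.0, 5.0)]


def inertia(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        # Squared Euclidean distance, accumulated coordinate by coordinate.
        total += sum((p - c) ** 2 for p, c in zip(point, centroid))
    return total


print(inertia(points, labels, centroids))  # 4.0
```

Each point here sits exactly one unit from its centroid, so the four squared distances of 1 add up to 4. A fitted scikit-learn `KMeans` model reports the same quantity in its `inertia_` attribute.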
A good model is one with low inertia AND a low number of clusters (K). However, this is a tradeoff because as K increases, inertia decreases.
To find the optimal K for a dataset, use the Elbow method: plot inertia against K and find the point where the decrease in inertia begins to slow.
K=3 is the “elbow” of this graph.
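The Elbow method can be tried out with scikit-learn along these lines. This is a minimal sketch, not a definitive recipe: the synthetic dataset (three blobs at hand-picked centers), the range of K values, and the `n_init` and `random_state` settings are all assumptions chosen for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Assumed synthetic data: 300 points drawn around three well-separated centers.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 0], [0, 10]],
    cluster_std=1.0,
    random_state=0,
)

# Fit K-Means for each candidate K and record the resulting inertia.
k_values = list(range(1, 7))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(X)
    inertias.append(model.inertia_)

# Print inertia per K; the elbow is where the drop between rows becomes small.
for k, value in zip(k_values, inertias):
    print(f"K={k}: inertia={value:.1f}")
```

Because the data was generated around three centers, the drop in inertia should flatten noticeably after K=3. On real data the elbow is often less sharp, so plotting `inertias` against `k_values` and inspecting the curve by eye is the usual next step.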