K-Means Clustering
Implementing K-Means: Step 3

The K-Means algorithm:

  1. Place k random centroids for the initial clusters.
  2. Assign data samples to the nearest centroid.
  3. Update centroids based on the above-assigned data samples.

Repeat Steps 2 and 3 until convergence.
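To see how the three steps fit together, here is a minimal sketch of the whole loop. The function name `kmeans`, the parameters `max_iter` and `seed`, and the choice to initialize centroids from randomly chosen samples are assumptions made for illustration; the lesson itself may place the initial centroids differently.

```python
import numpy as np

def kmeans(samples, k, max_iter=100, seed=0):
    """Illustrative K-Means loop (hypothetical helper, not the lesson's code)."""
    rng = np.random.default_rng(seed)
    # Step 1: use k distinct samples as the initial centroids (an assumption)
    centroids = samples[rng.choice(len(samples), size=k, replace=False)].astype(float)
    for _ in range(max_iter):
        # Step 2: label each sample with the index of its nearest centroid
        distances = np.linalg.norm(samples[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned samples
        new_centroids = centroids.copy()
        for i in range(k):
            members = samples[labels == i]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[i] = members.mean(axis=0)
        # Convergence: stop once no centroid moved
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

Here "convergence" is taken to mean that an update leaves every centroid where it was, which is one common stopping rule.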


In this exercise, we will implement Step 3.

Find the new cluster centers by taking the average of the points assigned to each cluster. To compute that average, we can use NumPy's np.mean() function.

Instructions

1.

Save the old centroids value before updating.

We have already imported deepcopy for you:

from copy import deepcopy

Store centroids into centroids_old using deepcopy():

centroids_old = deepcopy(centroids)
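
A short sketch of why a copy is needed here rather than a plain assignment. The array values and the names `alias` and `snapshot` are made up for illustration.

```python
from copy import deepcopy
import numpy as np

centroids = np.array([[1.0, 2.0], [3.0, 4.0]])

alias = centroids               # same array object, not a copy
snapshot = deepcopy(centroids)  # independent copy of the values

centroids[0] = [9.0, 9.0]       # in-place update, like the one in Step 3

print(alias[0])     # [9. 9.]  the alias changed along with centroids
print(snapshot[0])  # [1. 2.]  the deep copy kept the old value
```

With a plain assignment, centroids_old would always equal centroids and there would be nothing to compare after the update.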
2.

Then, create a for loop that iterates k times.

Since k = 3, the loop runs three times. On each iteration i, we can calculate the mean of the points that share the cluster label i.

Inside the for loop, create an array named points where we get all the data points that have the cluster label i.

There are two ways to do this; check the hints to see both!
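
The hints are not reproduced here, so the following is only a guess at what the two ways might be: a list comprehension and a NumPy boolean mask. The names `samples` and `labels`, and the toy values, are assumptions for illustration.

```python
import numpy as np

# Hypothetical data: four 2-D samples and their cluster labels
samples = np.array([[1.0, 1.0], [2.0, 2.0], [9.0, 9.0], [10.0, 10.0]])
labels = np.array([0, 0, 1, 1])
i = 0  # the cluster label being processed in the current loop iteration

# Option A: list comprehension over the sample indices
points_a = np.array([samples[j] for j in range(len(samples)) if labels[j] == i])

# Option B: boolean mask, which selects the matching rows in one step
points_b = samples[labels == i]

print(points_a)
print(points_b)
```

Both options select the same rows; the mask form is usually shorter and faster on NumPy arrays.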

3.

Then (still inside the for loop), calculate the mean of those points using .mean() to get the new centroid.

Store the new centroid in centroids[i].

The .mean() function looks like:

np.mean(input, axis=0)
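
A small sketch of what axis=0 does, using a made-up `points` array. Averaging down the rows gives one value per feature, which is the shape a centroid needs.

```python
import numpy as np

# Three hypothetical points with two features each
points = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])

per_feature = np.mean(points, axis=0)  # average down the rows
per_point = np.mean(points, axis=1)    # average across each row
overall = np.mean(points)              # average of every value

print(per_feature)  # [3. 6.]
print(per_point)    # [1.5 4.5 7.5]
print(overall)      # 4.5
```

Only the axis=0 result has one coordinate per feature, so it is the one that can be stored as a centroid.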
4.

Outside of the for loop, print centroids_old and centroids to see how the centroids changed.
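
Putting the four instructions together, a sketch of Step 3 on toy data might look like this. The `samples`, `labels`, and starting `centroids` values are invented, and the `shift` calculation is an optional extra that is not part of the exercise.

```python
from copy import deepcopy
import numpy as np

k = 3
# Hypothetical samples, their labels from Step 2, and the current centroids
samples = np.array([[1.0, 1.0], [1.0, 3.0], [5.0, 5.0],
                    [7.0, 5.0], [9.0, 1.0], [9.0, 3.0]])
labels = np.array([0, 0, 1, 1, 2, 2])
centroids = np.array([[0.0, 0.0], [5.0, 4.0], [8.0, 0.0]])

# 1. Save the old centroids before updating
centroids_old = deepcopy(centroids)

# 2. and 3. For each cluster, gather its points and store their mean
for i in range(k):
    points = samples[labels == i]
    centroids[i] = np.mean(points, axis=0)

# 4. Compare the centroids before and after the update
print(centroids_old)
print(centroids)

# Optional: Euclidean distance each centroid moved
shift = np.linalg.norm(centroids - centroids_old, axis=1)
print(shift)
```

When every entry of `shift` is zero (or close to it), another pass through Steps 2 and 3 would change nothing, which is one way to recognize convergence.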
