The K-Means algorithm:

1. Place `k` random centroids for the initial clusters.
2. Assign data samples to the nearest centroid.
3. Update centroids based on the above-assigned data samples.
4. Repeat Steps 2 and 3 until convergence.
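To make the steps above concrete, here is a minimal NumPy sketch of the whole loop. It is an illustration, not the lesson's reference solution: the function name `k_means`, the `max_iters` cap, the handling of empty clusters, and the synthetic `samples` array are all assumptions made for the example.

```python
import numpy as np

def k_means(samples, k, max_iters=100):
    """Hypothetical helper: cluster an (n, d) array of samples into k groups."""
    # Step 1: place k random centroids inside the bounding box of the data.
    low, high = samples.min(axis=0), samples.max(axis=0)
    centroids = np.random.uniform(low, high, size=(k, samples.shape[1]))
    labels = np.zeros(len(samples), dtype=int)

    for _ in range(max_iters):
        # Step 2: assign each sample to its nearest centroid (Euclidean distance).
        distances = np.linalg.norm(
            samples[:, None, :] - centroids[None, :, :], axis=2
        )
        labels = distances.argmin(axis=1)

        # Step 3: move each centroid to the mean of its assigned samples.
        new_centroids = centroids.copy()
        for j in range(k):
            members = samples[labels == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its centroid
                new_centroids[j] = members.mean(axis=0)

        # Repeat Steps 2 and 3 until the centroids stop moving.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return centroids, labels

# Synthetic stand-in data (not the Iris samples): three blobs of 50 points each.
np.random.seed(0)
samples = np.vstack([
    np.random.normal(loc=(0, 0), scale=0.5, size=(50, 2)),
    np.random.normal(loc=(5, 5), scale=0.5, size=(50, 2)),
    np.random.normal(loc=(0, 5), scale=0.5, size=(50, 2)),
])
centroids, labels = k_means(samples, k=3)
```

Because the starting centroids are random, different seeds can lead to different final clusters, which is one reason the initial placement in Step 1 matters.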

After looking at the scatter plot and having a better understanding of the Iris data, let’s start implementing the K-Means algorithm.

In this exercise, we will implement Step 1.

Because we expect there to be three clusters (for the three species of flowers), let’s implement K-Means where `k` is 3.

Using the NumPy library, we will create three *random* initial centroids and plot them along with our samples.

### Instructions

**1.**

First, create a variable named `k` and set it to 3.

**2.**

Then, use NumPy’s `random.uniform()` function to generate random values in two lists:

- a `centroids_x` list that will have `k` random values between `min(x)` and `max(x)`
- a `centroids_y` list that will have `k` random values between `min(y)` and `max(y)`

The `random.uniform()` function looks like:

np.random.uniform(low, high, size)

`centroids_x` will have the x-values for our initial random centroids, and `centroids_y` will have the y-values.
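As a rough sketch of Steps 1 and 2, the snippet below draws `k` random coordinates within the range of each feature. The `x` and `y` arrays here are hypothetical stand-ins generated for the example (in the lesson they come from the Iris data), and the seed is set only so the sketch is repeatable.

```python
import numpy as np

np.random.seed(42)  # assumption: seeded only to make the example repeatable

# Hypothetical stand-ins for the lesson's x and y feature columns.
x = np.random.uniform(4.0, 8.0, 150)
y = np.random.uniform(2.0, 4.5, 150)

# Number of clusters.
k = 3

# Draw k random x-values and k random y-values within the data's range.
centroids_x = np.random.uniform(min(x), max(x), size=k)
centroids_y = np.random.uniform(min(y), max(y), size=k)

print(centroids_x)
print(centroids_y)
```

Bounding the random values by `min` and `max` keeps every initial centroid inside the region the samples actually occupy.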

**3.**

Create an array named `centroids` and use the `zip()` function to add `centroids_x` and `centroids_y` to it.

Combined with `np.array()`, the `zip()` call looks like:

np.array(list(zip(array1, array2)))

Then, print `centroids`.

The `centroids` array should now have all the initial centroids.
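To illustrate what the `zip()` pattern produces, here is a small sketch. The coordinate values are made up for the example; in the exercise they would be the random values generated in the previous step.

```python
import numpy as np

# Hypothetical centroid coordinates, chosen by hand for illustration.
centroids_x = np.array([5.1, 6.3, 7.0])
centroids_y = np.array([3.4, 2.8, 3.1])

# zip() pairs the i-th x-value with the i-th y-value;
# np.array() turns the list of pairs into a (k, 2) array.
centroids = np.array(list(zip(centroids_x, centroids_y)))

print(centroids)
print(centroids.shape)
```

Each row of `centroids` is one centroid's `(x, y)` position. `np.column_stack((centroids_x, centroids_y))` would build the same array without going through `zip()`.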

**4.**

Make a scatter plot of `y` vs `x`.

Make a scatter plot of `centroids_y` vs `centroids_x`.

Show the plots to see your centroids!
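One possible way to draw both scatter plots on the same axes is sketched below with Matplotlib. The helper name `plot_samples_and_centroids`, the marker styling, and the synthetic data are assumptions for the example, not part of the exercise.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_samples_and_centroids(x, y, centroids_x, centroids_y):
    """Hypothetical helper: draw samples and centroids on one set of axes."""
    fig, ax = plt.subplots()
    # Samples: semi-transparent so overlapping points stay visible.
    ax.scatter(x, y, alpha=0.5, label="samples")
    # Centroids: larger diamond markers so they stand out from the samples.
    ax.scatter(centroids_x, centroids_y, marker="D", s=100, label="centroids")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    return ax

# Synthetic stand-ins for the lesson's data and random centroids.
np.random.seed(1)
x = np.random.uniform(4.0, 8.0, 150)
y = np.random.uniform(2.0, 4.5, 150)
centroids_x = np.random.uniform(min(x), max(x), size=3)
centroids_y = np.random.uniform(min(y), max(y), size=3)

ax = plot_samples_and_centroids(x, y, centroids_x, centroids_y)
```

Calling `plt.show()` after this displays the figure. Two consecutive `plt.scatter()` calls before a single `plt.show()` would give the same result without the helper.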
