When you have a larger DataFrame, you might want to select just a few columns.

For instance, let’s return to a DataFrame of orders from ShoeFly.com:

id     first_name  last_name  email              shoe_type     shoe_material  shoe_color
54791  Rebecca     Lindsay    [email protected]  clogs         faux-leather   black
53450  Emily       Joyce      [email protected]  ballet flats  faux-leather   navy
91987  Joyce       Waller     [email protected]  sandals       fabric         black
14437  Justin      Erickson   [email protected]  clogs         faux-leather   red

We might be interested only in each customer’s last_name and email. We want a DataFrame like this:

last_name  email
Lindsay    [email protected]
Joyce      [email protected]
Waller     [email protected]
Erickson   [email protected]

To select two or more columns from a DataFrame, we use a list of the column names. To create the DataFrame shown above, we would use:

new_df = orders[['last_name', 'email']]

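Note: Make sure that you have a double set of brackets ([[]]). The outer pair indexes the DataFrame and the inner pair creates the list of column names. With a single pair, pandas treats 'last_name', 'email' as one column label and raises a KeyError.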
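As a minimal sketch of the selection described above, the following rebuilds a small version of the orders table and selects the two columns. The email values are placeholders (the real addresses are not shown in this lesson), so treat them as assumptions made for illustration only.

```python
import pandas as pd

# A small stand-in for the ShoeFly.com orders table.
# The email addresses are hypothetical placeholders, not the real data.
orders = pd.DataFrame({
    "id": [54791, 53450, 91987, 14437],
    "first_name": ["Rebecca", "Emily", "Joyce", "Justin"],
    "last_name": ["Lindsay", "Joyce", "Waller", "Erickson"],
    "email": [
        "rebecca@example.com",
        "emily@example.com",
        "joyce@example.com",
        "justin@example.com",
    ],
    "shoe_type": ["clogs", "ballet flats", "sandals", "clogs"],
    "shoe_material": ["faux-leather", "faux-leather", "fabric", "faux-leather"],
    "shoe_color": ["black", "navy", "black", "red"],
})

# Outer brackets index the DataFrame; inner brackets build the list of names.
new_df = orders[["last_name", "email"]]
print(new_df)

# With only one pair of brackets, pandas looks for a single column whose
# label is the tuple ('last_name', 'email') and fails to find it.
single_brackets_failed = False
try:
    orders["last_name", "email"]
except KeyError:
    single_brackets_failed = True
print("Single brackets raised KeyError:", single_brackets_failed)
```

The columns in the result follow the order of the list, so swapping the names in the list swaps the columns in new_df.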

Instructions

1. Now, you want to compare visits to the Northern and Southern clinics.

Create a variable called clinic_north_south that contains ONLY the data from the columns clinic_north and clinic_south.

2. When we select multiple columns, do we get a Series or a DataFrame?

After you’ve created the variable, enter the command:

print(type(clinic_north_south))

to see what data type you’ve created.

How is this different from what happened in the previous exercise?
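One way to explore this question is to compare the type of a single-column selection with the type of a multi-column selection. The sketch below uses a made-up DataFrame: the column names clinic_north and clinic_south come from this exercise, but the month column, the variable names, and all the visit counts are assumptions for illustration.

```python
import pandas as pd

# Hypothetical clinic data; the numbers are invented for this example.
df = pd.DataFrame({
    "month": ["January", "February", "March"],
    "clinic_north": [100, 45, 96],
    "clinic_south": [23, 145, 65],
})

# Single brackets with one column name.
one_column = df["clinic_north"]

# Double brackets with a list of column names.
two_columns = df[["clinic_north", "clinic_south"]]

print(type(one_column))
print(type(two_columns))
```

Running both print statements side by side makes the difference between the two selection styles easy to see.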
