Introduction to Data Frames in R
Review

There you have it! With the power of readr and dplyr in your hands, you can now:

  • load data from a CSV into a data frame
  • inspect the data frame with head() and summary()
  • select() the columns you want to analyze
  • filter() the rows with comparison and logical operators
  • arrange() rows in ascending or descending order

You’ve also been exposed to the pipe %>%, a powerful tool for chaining function calls, as well as the general principles of data manipulation.
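As a quick refresher, here is a minimal sketch of loading, inspecting, and piping. It assumes readr 2.0 or later (for `I()` literal input and `show_col_types`); the CSV text, the `artists_demo` name, and the `group` column are made up for illustration and are not the lesson's data.

```r
library(readr)
library(dplyr)

# Hypothetical CSV text standing in for a real file; all values are made up.
csv_text <- "group,genre,spotify_monthly_listeners
Band A,Rock,25000000
Band B,Hip Hop,30000000
Band C,Pop,15000000"

# I() tells read_csv() to treat the string as literal data, not a file path.
artists_demo <- read_csv(I(csv_text), show_col_types = FALSE)

# Inspect the first rows and per-column summaries.
head(artists_demo)
summary(artists_demo)

# The pipe passes its left-hand side as the first argument of the next call,
# so these two lines select the same columns.
piped  <- artists_demo %>% select(group, genre)
nested <- select(artists_demo, group, genre)
```

With a real file, you would pass its path to `read_csv()` instead of `I(csv_text)`.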

Now that you are well on your way to mastering dplyr, let’s combine what you have learned to perform an analysis and see the true power of the pipe!

Instructions

1.

The code in notebook.Rmd completes a sequence of steps:

  • columns are selected from artists and saved to chosen_cols
  • chosen_cols is filtered and saved to popular_not_hip_hop
  • popular_not_hip_hop is arranged and saved to youtube_desc

Notice that to arrive at this result, two intermediate variables chosen_cols and popular_not_hip_hop were created.
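The contents of notebook.Rmd are not reproduced here, so the following is only a sketch of what such a step-by-step sequence might look like. It assumes the same columns and conditions that the tasks below describe; the `artists` values and the `group` column are made up for illustration.

```r
library(dplyr)

# Made-up stand-in for the lesson's artists data frame.
artists <- tibble(
  group = c("Band A", "Band B", "Band C", "Band D"),
  country = c("US", "UK", "US", "CA"),
  genre = c("Rock", "Hip Hop", "Pop", "Rock"),
  spotify_monthly_listeners = c(25000000, 30000000, 15000000, 40000000),
  youtube_subscribers = c(5000000, 9000000, 2000000, 7000000),
  year_founded = c(1990, 2005, 2010, 1985),
  albums = c(8, 4, 3, 12)
)

# Step 1: drop the columns we do not need.
chosen_cols <- select(artists, -country, -year_founded, -albums)

# Step 2: keep popular artists that are not Hip Hop.
popular_not_hip_hop <- filter(
  chosen_cols,
  spotify_monthly_listeners > 20000000,
  genre != "Hip Hop"
)

# Step 3: sort by YouTube subscribers, largest first.
youtube_desc <- arrange(popular_not_hip_hop, desc(youtube_subscribers))

head(youtube_desc)
```

Each step stores a full data frame that is used only once, by the step that follows it.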

With the power of the pipe, we can clean up this code!

In the last code block, select() all columns except country, year_founded, and albums from artists using the pipe %>%. Save the result to artists and view it with head().

2.

Place a pipe %>% after the call to select(). This will pipe your selection to the next line, where you should filter() all rows where spotify_monthly_listeners is greater than 20000000 and genre is not equal to 'Hip Hop'. Keep this data frame saved to artists.

3.

Place a pipe %>% after the call to filter(). This will pipe your filtered data frame to the next line, where you should arrange() the rows in descending order by youtube_subscribers. Keep this data frame saved to artists.

Did you get the same result as the previous code block?

Make sure you’ve called head(artists) to see the resulting data frame!
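For comparison, the three tasks can be chained into a single piped expression. This is one possible sketch, not necessarily the course's reference solution; the `artists` values and the `group` column are made up for illustration.

```r
library(dplyr)

# Made-up stand-in for the lesson's artists data frame.
artists <- tibble(
  group = c("Band A", "Band B", "Band C", "Band D"),
  country = c("US", "UK", "US", "CA"),
  genre = c("Rock", "Hip Hop", "Pop", "Rock"),
  spotify_monthly_listeners = c(25000000, 30000000, 15000000, 40000000),
  youtube_subscribers = c(5000000, 9000000, 2000000, 7000000),
  year_founded = c(1990, 2005, 2010, 1985),
  albums = c(8, 4, 3, 12)
)

# Each %>% hands the result of one line to the next, so no
# intermediate variables are needed.
artists <- artists %>%
  select(-country, -year_founded, -albums) %>%
  filter(spotify_monthly_listeners > 20000000, genre != "Hip Hop") %>%
  arrange(desc(youtube_subscribers))

head(artists)
```

Because the result is assigned back to artists, the original data frame is overwritten; use a different name if you still need the unfiltered data.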
