Introduction to Data Frames in R
What is a Data Frame?

A data frame is an R object that stores tabular data in rows and columns. You can think of a data frame as a spreadsheet or as a SQL table. While data frames can be created directly in R, they are usually loaded from an external source such as a CSV file, an Excel spreadsheet, or a SQL query.

Each column has a name and stores the values of one variable. Each row contains a set of values, one from each column. Different columns can hold different types of data, such as numeric, character, or logical, but all values within one column share the same type. A missing value of any type is marked as NA.
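To make the point about column types concrete, here is a minimal sketch in base R. The data frame measurements and its column names are made up for illustration; they are not part of the lesson's data.

```r
# Hypothetical example data; the names below are invented for illustration.
measurements <- data.frame(
  label   = c("a", "b", "c"),      # character column
  value   = c(1.5, NA, 3.2),       # numeric column with one missing value
  checked = c(TRUE, FALSE, TRUE),  # logical column
  stringsAsFactors = FALSE         # keep text as character in R versions before 4.0
)

# Each column has exactly one class
column_types <- sapply(measurements, class)
print(column_types)

# NA marks a missing value; the column itself is still numeric
print(is.na(measurements$value))
```

Note that NA does not change the type of the value column: it only flags which entry is missing.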

A data frame containing the address, age and name of students in a class could look like this:

address          age   name
123 Main St.     34    John Smith
456 Maple Ave.   28    Jane Doe
789 Broadway     51    Joe Schmo

As the header line at the top shows, the column names of this data frame are address, age, and name.
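The student table above could be built by hand roughly like this. This is only a sketch using base R's data.frame(); the variable name students is an assumption made for this example, and the values are copied from the table.

```r
# Build the example table column by column.
# "students" is a name chosen for this sketch, not one defined by the lesson.
students <- data.frame(
  address = c("123 Main St.", "456 Maple Ave.", "789 Broadway"),
  age     = c(34, 28, 51),
  name    = c("John Smith", "Jane Doe", "Joe Schmo"),
  stringsAsFactors = FALSE  # keep text columns as character in R versions before 4.0
)

print(students)         # show the whole data frame
print(names(students))  # the column names
print(dim(students))    # number of rows, then number of columns
```

Each argument to data.frame() becomes one column, so all the vectors need the same length, here one entry per student.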

Note: when working with dplyr, you might see functions that take a data frame as an argument and output something called a tibble. Tibbles are modern versions of data frames in R, and they operate in essentially the same way. The terms tibble and data frame are often used interchangeably. Here on Codecademy we will use the term data frame!



The code in notebook.Rmd loads a data frame named songs that contains data about 7 songs from popular music groups (you’ll learn how to load a data frame yourself shortly).

Type songs in the empty code block and run the code to view the data frame. Make sure to click the arrow to explore each column!
