
R – Data Frames


R is an open-source programming language that is widely used for statistical computing and data analysis. Data frames are general-purpose R objects used to store tabular data.

A data frame can be thought of as a matrix in which each column may hold a different data type, although all values within a single column share one type. An R data frame has three principal components: the data, the rows, and the columns.
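The idea that each column can hold its own type can be checked directly. The following is a minimal sketch; the data frame `people` and its column names are hypothetical and were invented for this illustration.

```r
# Hypothetical data frame with three columns of different types
people <- data.frame(
  name = c("Asha", "Ravi", "Meera"),   # character column
  age = c(28L, 35L, 41L),              # integer column
  member = c(TRUE, FALSE, TRUE),       # logical column
  stringsAsFactors = FALSE
)

# A data frame is stored as a list of equal-length columns,
# so sapply() can report the class of each column
col_classes <- sapply(people, class)
print(col_classes)

# is.list() confirms the underlying list structure
print(is.list(people))
```

Storing each column as its own vector is what allows mixed types across columns, which a plain matrix does not permit.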

R Data Frames Structure

A data frame presents data in tabular form, with rows as observations and columns as variables, which makes it easier to work with and understand.

[Figure: R – Data Frames]

Create Dataframe in R Programming Language

To create an R data frame, call the data.frame() function and pass each vector as a named argument; each vector becomes one column.

R

# R program to create dataframe
 
# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)
# print the data frame
print(friend.data)

                    

Output:

  friend_id friend_name
1         1      Sachin
2         2      Sourav
3         3      Dravid
4         4      Sehwag
5         5       Dhoni

Get the Structure of the R Data Frame

You can inspect the structure of an R data frame with the str() function.

str() compactly displays the internal structure of an R object, including large nested lists. For basic objects it gives roughly one line per component, showing each component's type and its first few values.

R

# R program to get the
# structure of the data frame
 
# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)
# using str(); it prints the structure itself and returns NULL,
# so wrapping it in print() would add a stray NULL line
str(friend.data)



Output:

'data.frame':	5 obs. of  2 variables:
 $ friend_id  : int  1 2 3 4 5
 $ friend_name: chr  "Sachin" "Sourav" "Dravid" "Sehwag" ...

Summary of Data in the R data frame

A statistical summary of the data in an R data frame can be obtained with the summary() function.

summary() is a generic function: it dispatches to a method chosen by the class of its first argument. For a data frame it summarizes each column, giving quartiles and the mean for numeric columns and the length, class, and mode for character columns; for fitted model objects it summarizes the fit.

R

# R program to get the
# summary of the data frame
 
# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)
# using summary()
print(summary(friend.data))

                    

Output:

   friend_id friend_name       
 Min.   :1   Length:5          
 1st Qu.:2   Class :character  
 Median :3   Mode  :character  
 Mean   :3                     
 3rd Qu.:4                     
 Max.   :5                     

Extract Data from Data Frame in R 

Extracting data from an R data frame means accessing its rows or columns. A specific column can be extracted by its column name.

R

# R program to extract
# data from the data frame
 
# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)
 
# Extracting friend_name column
result <- data.frame(friend.data$friend_name)
print(result)

                    

Output:

  friend.data.friend_name
1                  Sachin
2                  Sourav
3                  Dravid
4                  Sehwag
5                   Dhoni
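Rows can be extracted as well as columns. The sketch below, which reuses the friend.data example, shows row and column extraction with the `[row, column]` form; the variable names `second_row`, `some_names`, and `name_column` are illustrative only.

```r
friend.data <- data.frame(
  friend_id = 1:5,
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

# Extract the second row with all columns; the result is a one-row data frame
second_row <- friend.data[2, ]
print(second_row)

# Extract rows 1 and 3 of the friend_name column; the result is a character vector
some_names <- friend.data[c(1, 3), "friend_name"]
print(some_names)

# Extract the column as a data frame that keeps its original column name
name_column <- friend.data["friend_name"]
print(names(name_column))
```

Indexing with `friend.data["friend_name"]` keeps the original column name, whereas wrapping `friend.data$friend_name` in data.frame() produces the derived name `friend.data.friend_name`.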

Expand Data Frame in R Language

A data frame in R can be expanded by adding new columns and rows to the already existing R data frame. 

R

# R program to expand
# the data frame
 
# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)
 
# Expanding data frame
friend.data$location <- c("Kolkata", "Delhi",
                       "Bangalore", "Hyderabad",
                       "Chennai")
resultant <- friend.data
# print the modified data frame
print(resultant)

                    

Output:

  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai

In R, you can perform many kinds of operations on a data frame, such as accessing rows and columns, selecting a subset of the data frame, editing its values, and deleting rows and columns.

Please refer to DataFrame Operations in R to know about all types of operations that can be performed on a data frame.

Access Items in R Data Frame

We can access the columns of a data frame with the $ operator, single brackets [ ], or double brackets [[ ]].

R

# creating a data frame
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)
 
# Access Items using []
friend.data[1]
 
# Access Items using [[]]
friend.data[['friend_name']]
 
# Access Items using $
friend.data$friend_id

                    

Output:

  friend_id
1         1
2         2
3         3
4         4
5         5
Access Items using [[]]
[1] "Sachin" "Sourav" "Dravid" "Sehwag" "Dhoni"
Access Items using $
[1] 1 2 3 4 5
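The three access forms do not all return the same kind of object, as the output above suggests. The following sketch makes the difference explicit; the variable names `by_single`, `by_double`, and `by_dollar` are hypothetical.

```r
friend.data <- data.frame(
  friend_id = 1:5,
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

by_single <- friend.data["friend_id"]     # single brackets: one-column data frame
by_double <- friend.data[["friend_id"]]   # double brackets: plain integer vector
by_dollar <- friend.data$friend_id        # $ operator: plain integer vector

print(class(by_single))
print(class(by_double))

# [[ ]] and $ return the same vector
print(identical(by_double, by_dollar))
```

Single brackets are the usual choice when the result should stay a data frame, and [[ ]] or $ when the column values themselves are needed.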

Number of Rows and Columns

We can find out how many rows and columns are present in a data frame by using the dim() function. It returns the number of rows followed by the number of columns.

R

# creating a data frame
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)
 
# find out the number of rows and columns
dim(friend.data)

                    

Output:

[1] 5 2
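When only one of the two dimensions is needed, base R also provides nrow() and ncol(). A short sketch on the same data follows; `n_rows` and `n_cols` are illustrative variable names.

```r
friend.data <- data.frame(
  friend_id = 1:5,
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

# Number of rows and number of columns, taken separately
n_rows <- nrow(friend.data)
n_cols <- ncol(friend.data)
print(n_rows)
print(n_cols)

# names() lists the column names
print(names(friend.data))
```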

Add Rows and Columns in R Data Frame

You can easily add rows and columns to an R data frame. Insertion expands the existing data frame without the need to build a new one from scratch.

Let's look at how to add rows and columns to a data frame, with an example of each:

Add Rows in R Data Frame

To add rows to a data frame, you can use the built-in function rbind().

The following example demonstrates how rbind() works on an R data frame.

R

# Creating a dataframe representing products in a store
Products <- data.frame(
  Product_ID = c(101, 102, 103),
  Product_Name = c("T-Shirt", "Jeans", "Shoes"),
  Price = c(15.99, 29.99, 49.99),
  Stock = c(50, 30, 25)
)
 
# Print the existing dataframe
cat("Existing dataframe (Products):\n")
print(Products)
 
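# Adding a new row for a new product.
# The row is built as a one-row data frame so each column keeps its type;
# c(104, "Sunglasses", 39.99, 40) would coerce every value to character.
New_Product <- data.frame(
  Product_ID = 104,
  Product_Name = "Sunglasses",
  Price = 39.99,
  Stock = 40
)
Products <- rbind(Products, New_Product)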
 
# Print the updated dataframe after adding the new product
cat("\nUpdated dataframe after adding a new product:\n")
print(Products)

                    

Output:

Existing dataframe (Products):

  Product_ID Product_Name Price Stock
1        101      T-Shirt 15.99    50
2        102        Jeans 29.99    30
3        103        Shoes 49.99    25

Updated dataframe after adding a new product:

  Product_ID Product_Name Price Stock
1        101      T-Shirt 15.99    50
2        102        Jeans 29.99    30
3        103        Shoes 49.99    25
4        104   Sunglasses 39.99    40
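The printed table alone does not reveal the column types after rbind(). The sketch below, an illustration rather than part of the original example, compares appending a row built with c() against appending a one-row data frame; the names `row_as_vector`, `row_as_df`, `coerced`, and `typed` are hypothetical.

```r
Products <- data.frame(
  Product_ID = c(101, 102, 103),
  Product_Name = c("T-Shirt", "Jeans", "Shoes"),
  Price = c(15.99, 29.99, 49.99),
  stringsAsFactors = FALSE
)

# c() with mixed numbers and strings produces a character vector
row_as_vector <- c(104, "Sunglasses", 39.99)
print(class(row_as_vector))

# Appending that vector turns the numeric columns into character columns
coerced <- rbind(Products, row_as_vector)
print(sapply(coerced, class))

# A one-row data frame keeps each column's original type
row_as_df <- data.frame(
  Product_ID = 104,
  Product_Name = "Sunglasses",
  Price = 39.99,
  stringsAsFactors = FALSE
)
typed <- rbind(Products, row_as_df)
print(sapply(typed, class))
```

Keeping Price numeric matters as soon as arithmetic is applied to the column, for example mean(typed$Price).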

Add Columns in R Data Frame

To add columns to a data frame, you can use the built-in function cbind(). The new column should supply one value for each existing row.

The following example demonstrates how cbind() works on an R data frame.

R

# Existing dataframe representing products in a store
Products <- data.frame(
  Product_ID = c(101, 102, 103),
  Product_Name = c("T-Shirt", "Jeans", "Shoes"),
  Price = c(15.99, 29.99, 49.99),
  Stock = c(50, 30, 25)
)
 
# Print the existing dataframe
cat("Existing dataframe (Products):\n")
print(Products)
 
# Adding a new column for 'Discount' to the dataframe
Discount <- c(5, 10, 8)  # New column values for discount
Products <- cbind(Products, Discount)
 
# cbind() already names the new column after the variable 'Discount';
# the line below shows how to set the last column's name explicitly
colnames(Products)[ncol(Products)] <- "Discount"
 
# Print the updated dataframe after adding the new column
cat("\nUpdated dataframe after adding a new column 'Discount':\n")
print(Products)

                    

Output:

Existing dataframe (Products):

  Product_ID Product_Name Price Stock
1        101      T-Shirt 15.99    50
2        102        Jeans 29.99    30
3        103        Shoes 49.99    25

Updated dataframe after adding a new column 'Discount':

  Product_ID Product_Name Price Stock Discount
1        101      T-Shirt 15.99    50        5
2        102        Jeans 29.99    30       10
3        103        Shoes 49.99    25        8

Remove Rows and Columns 

Rows and columns can also be removed from an existing R data frame.

Remove Row in R DataFrame

R

# Create a data frame (subset() is part of base R, so no package is loaded)
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)
 
data
 
# Remove a row with friend_id = 3
data <- subset(data, friend_id != 3)
 
data

                    

Output:

  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai

# Remove a row with friend_id = 3

  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai

In the above code, we first created a data frame called data with three columns: friend_id, friend_name, and location. To remove a row with friend_id equal to 3, we used the subset() function and specified the condition friend_id != 3. This removed the row with friend_id equal to 3.
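Rows can also be dropped with bracket indexing instead of subset(). The following is a sketch of two base R alternatives on the same data; `by_condition` and `by_position` are illustrative names.

```r
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

# Logical indexing: keep the rows where the condition is TRUE
by_condition <- data[data$friend_id != 3, ]
print(by_condition)

# Negative indexing: drop the third row by its position
by_position <- data[-3, ]
print(by_position$friend_id)
```

Removing by condition keeps working if the row order changes, while removing by position depends on where the row currently sits.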

Remove Column in R DataFrame

R

# select() comes from the dplyr package
library(dplyr)
# Create a data frame
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)
data
 
# Remove the 'location' column
data <- select(data, -location)
 
data

                    

Output:

  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai

# Remove the 'location' column

  friend_id friend_name
1         1      Sachin
2         2      Sourav
3         3      Dravid
4         4      Sehwag
5         5       Dhoni

To remove the location column, we used the select() function from the dplyr package and specified -location. The minus sign indicates that we want to drop the location column. The resulting data frame data has only two columns: friend_id and friend_name.
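If dplyr is not available, a column can be removed with base R alone. The sketch below shows two common approaches on the same data; `without_location` is a hypothetical variable name.

```r
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai"),
  stringsAsFactors = FALSE
)

# Keep every column whose name is not "location"; data itself is unchanged
without_location <- data[, names(data) != "location"]
print(names(without_location))

# Assigning NULL removes the column from data in place
data$location <- NULL
print(names(data))
```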

Combining Data Frames in R

There are two ways to combine data frames in R: vertically or horizontally.

Let's look at both cases with examples:

Combine R Data Frame Vertically

If you want to combine two data frames vertically, you can use the rbind() function, which accepts two or more data frames. The data frames must have the same column names.

R

# Creating two sample dataframes
df1 <- data.frame(
  Name = c("Alice", "Bob"),
  Age = c(25, 30),
  Score = c(80, 75)
)
 
df2 <- data.frame(
  Name = c("Charlie", "David"),
  Age = c(28, 35),
  Score = c(90, 85)
)
 
# Print the existing dataframes
cat("Dataframe 1:\n")
print(df1)
 
cat("\nDataframe 2:\n")
print(df2)
 
# Combining the dataframes using rbind()
combined_df <- rbind(df1, df2)
 
# Print the combined dataframe
cat("\nCombined Dataframe:\n")
print(combined_df)

                    

Output:

Dataframe 1:

   Name Age Score
1 Alice  25    80
2   Bob  30    75

Dataframe 2:

     Name Age Score
1 Charlie  28    90
2   David  35    85

Combined Dataframe:

     Name Age Score
1   Alice  25    80
2     Bob  30    75
3 Charlie  28    90
4   David  35    85
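For data frames, rbind() matches columns by name rather than by position. The sketch below illustrates this with a second data frame whose columns are deliberately listed in a different order; the reduced two-column data is an assumption made to keep the example short.

```r
df1 <- data.frame(
  Name = c("Alice", "Bob"),
  Age = c(25, 30),
  stringsAsFactors = FALSE
)

# Same column names as df1, but in the opposite order
df2 <- data.frame(
  Age = c(28, 35),
  Name = c("Charlie", "David"),
  stringsAsFactors = FALSE
)

# The result follows the column order of the first data frame,
# and each value lands in the column with the matching name
combined <- rbind(df1, df2)
print(combined)
```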

Combine R Data Frame Horizontally

If you want to combine two data frames horizontally, you can use the cbind() function, which accepts two or more data frames. The data frames should have the same number of rows.

R

# Creating two sample dataframes
df1 <- data.frame(
  Name = c("Alice", "Bob"),
  Age = c(25, 30),
  Score = c(80, 75)
)
 
df2 <- data.frame(
  Height = c(160, 175),
  Weight = c(55, 70)
)
 
# Print the existing dataframes
cat("Dataframe 1:\n")
print(df1)
 
cat("\nDataframe 2:\n")
print(df2)
 
# Combining the dataframes using cbind()
combined_df <- cbind(df1, df2)
 
# Print the combined dataframe
cat("\nCombined Dataframe:\n")
print(combined_df)

                    

Output:

Dataframe 1:

   Name Age Score
1 Alice  25    80
2   Bob  30    75

Dataframe 2:

  Height Weight
1    160     55
2    175     70

Combined Dataframe:

   Name Age Score Height Weight
1 Alice  25    80    160     55
2   Bob  30    75    175     70
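cbind() pairs rows purely by position, so the row counts need to be compatible. The following sketch shows what happens with a mismatched row count; the data frame `df3` and the flag `mismatch_failed` are hypothetical names introduced for this illustration.

```r
df1 <- data.frame(
  Name = c("Alice", "Bob"),
  Age = c(25, 30),
  stringsAsFactors = FALSE
)

# Hypothetical data frame with three rows instead of two
df3 <- data.frame(
  Height = c(160, 175, 168)
)

# try() captures the error so the script can continue
attempt <- try(cbind(df1, df3), silent = TRUE)
mismatch_failed <- inherits(attempt, "try-error")
print(mismatch_failed)
```

Because no key column is consulted, it is worth confirming that both data frames list their rows in the same order before combining them this way.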


This article covered R data frames and their basic operations: creating, inspecting, summarizing, accessing, adding to, removing from, and combining them. It aims to make you familiar with data frames in R so that you can use them in your own projects.



Last Updated : 15 Dec, 2023