R is an open-source programming language that is widely used for statistical computing and data analysis. Data frames in R are generic data objects used to store tabular data.
A data frame can be thought of as a matrix-like table in which each column may hold a different data type; internally it is a list of equal-length vectors. An R data frame is made up of three principal components: the data, the rows, and the columns.
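To make the point about mixed column types concrete, here is a minimal sketch. The `people` data frame and its column names are made up for illustration; it compares a data frame with a matrix built from similar values.

```r
# Hypothetical example data: three columns of three different types.
people <- data.frame(
  id     = 1:3,
  name   = c("Asha", "Ben", "Chen"),
  member = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Each column of a data frame keeps its own type.
col_types <- sapply(people, class)
print(col_types)

# A matrix holds a single type, so mixing numbers and strings
# coerces every value to character.
m <- cbind(id = 1:3, name = c("Asha", "Ben", "Chen"))
print(typeof(m))

# Internally, a data frame is a list of equal-length vectors.
print(is.list(people))
```

This is why a data frame, rather than a matrix, is the usual container for tables that mix numbers, text, and logical values.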
R Data Frames Structure
A data frame presents data in tabular form, with named columns and numbered rows, which makes it easier to operate on and understand.
[Figure: R Data Frames]
Create Dataframe in R Programming Language
To create an R data frame, use the data.frame() function and pass each of the vectors you have created as a named argument. Each argument name becomes a column name.
R
# Build a data frame from two vectors of equal length
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)
print(friend.data)
Output:
  friend_id friend_name
1         1      Sachin
2         2      Sourav
3         3      Dravid
4         4      Sehwag
5         5       Dhoni
Get the Structure of the R Data Frame
You can inspect the structure of an R data frame with the str() function.
str() can also display the internal structure of large nested lists. For basic R objects it prints roughly one line per component, showing the type of each component and its first few values.
R
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)
# str() prints the structure itself and returns NULL,
# so it should not be wrapped in print()
str(friend.data)
Output:
'data.frame':	5 obs. of  2 variables:
 $ friend_id  : int  1 2 3 4 5
 $ friend_name: chr  "Sachin" "Sourav" "Dravid" "Sehwag" ...
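As a small sketch of how str() handles nested lists, consider the following. The `config` list and its contents are hypothetical and exist only for this illustration.

```r
# Hypothetical nested list used only to illustrate str().
config <- list(
  name   = "experiment",
  params = list(alpha = 0.05, steps = 10L),
  tags   = c("a", "b")
)

# str() prints one line per component and indents nested components.
str(config)

# max.level limits how deep str() descends into nested lists.
str(config, max.level = 1)
```

Limiting the depth with max.level can keep the output readable when a list is deeply nested.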
Summary of Data in the R data frame
The statistical summary and nature of the data in an R data frame can be obtained by applying the summary() function.
summary() is a generic function: it invokes a particular method depending on the class of its first argument. For a data frame it reports the minimum, quartiles, mean, and maximum of each numeric column, and the length, class, and mode of each character column.
R
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)
print(summary(friend.data))
Output:
   friend_id friend_name
 Min.   :1   Length:5
 1st Qu.:2   Class :character
 Median :3   Mode  :character
 Mean   :3
 3rd Qu.:4
 Max.   :5
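The dispatch on class described above can be seen with a minimal sketch. The vectors `nums` and `groups` and the data frame `df` are made-up examples.

```r
# summary() chooses a method based on the class of its argument.
nums <- c(2, 4, 6, 8)
print(summary(nums))      # numeric vector: quartiles and mean

groups <- factor(c("a", "b", "a"))
print(summary(groups))    # factor: counts per level

# Applied to a single column, the method for that column's class is used.
df <- data.frame(score = c(10, 20, 30))
print(summary(df$score))
```

The same call therefore produces different kinds of summaries for numeric data, factors, and whole data frames.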
Extract Data from Data Frame in R
Extracting data from an R data frame means accessing its rows or columns. You can extract a specific column from an R data frame using its column name.
R
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)
# Extract the friend_name column and wrap it in a new data frame
result <- data.frame(friend.data$friend_name)
print(result)
Output:
  friend.data.friend_name
1                  Sachin
2                  Sourav
3                  Dravid
4                  Sehwag
5                   Dhoni
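Rows can be extracted in a similar way. The sketch below uses base R's [row, column] indexing on the same friend.data example; the variable names `first_two` and `late_ids` are introduced here only for illustration.

```r
friend.data <- data.frame(
  friend_id = 1:5,
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

# Rows are selected with [row, column]; a blank column index keeps all columns.
first_two <- friend.data[1:2, ]
print(first_two)

# A logical condition selects the rows for which it is TRUE.
late_ids <- friend.data[friend.data$friend_id > 3, ]
print(late_ids)

# Giving both a row and a column returns a single cell.
print(friend.data[3, "friend_name"])
```

Note the comma inside the brackets: without it, a single index selects columns rather than rows.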
Expand Data Frame in R Language
A data frame in R can be expanded by adding new columns and rows to an existing data frame. The example below adds a column by assigning a vector to a new column name with $.
R
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)
# Assigning to a new column name adds that column
friend.data$location <- c("Kolkata", "Delhi",
                          "Bangalore", "Hyderabad",
                          "Chennai")
resultant <- friend.data
print(resultant)
Output:
  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai
In R, you can perform various operations on a data frame, such as accessing rows and columns, selecting a subset of the data frame, editing values, and deleting rows and columns.
Please refer to DataFrame Operations in R to know about all types of operations that can be performed on a data frame.
Access Items in R Data Frame
We can access the columns of a data frame using single brackets [ ], double brackets [[ ]], or the $ operator. Single brackets return a data frame, while [[ ]] and $ return the column itself as a vector.
R
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)
# Access items using [ ]: returns a one-column data frame
friend.data[1]
# Access items using [[ ]]: returns the column as a vector
friend.data[['friend_name']]
# Access items using $: also returns the column as a vector
friend.data$friend_id
Output:
  friend_id
1         1
2         2
3         3
4         4
5         5
[1] "Sachin" "Sourav" "Dravid" "Sehwag" "Dhoni"
[1] 1 2 3 4 5
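The difference between the three access forms is easiest to see by checking the class of each result. This is a minimal sketch; the variable names `single`, `double`, and `dollar` are chosen only for illustration.

```r
friend.data <- data.frame(
  friend_id = 1:5,
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

single <- friend.data["friend_name"]    # still a data frame with one column
double <- friend.data[["friend_name"]]  # the column itself, a character vector
dollar <- friend.data$friend_name       # same result as [[ ]]

print(class(single))
print(class(double))
print(identical(double, dollar))
```

Use single brackets when the result should stay a data frame, and [[ ]] or $ when a plain vector is needed for further computation.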
Number of Rows and Columns
We can find out how many rows and columns are present in a data frame by using the dim() function. It returns the number of rows first, followed by the number of columns.
R
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)
dim(friend.data)
Output:
[1] 5 2
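If only one of the two dimensions is needed, base R also provides nrow() and ncol(). A short sketch on the same example data:

```r
friend.data <- data.frame(
  friend_id = 1:5,
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

n_rows <- nrow(friend.data)  # number of rows only
n_cols <- ncol(friend.data)  # number of columns only
print(n_rows)
print(n_cols)

# Together they match what dim() reports.
print(identical(dim(friend.data), c(n_rows, n_cols)))
```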
Add Rows and Columns in R Data Frame
You can easily add rows and columns to an R data frame. Insertion extends an existing data frame rather than building one from scratch.
Let's look at how to add rows and columns to a data frame, with examples:
Add Rows in R Data Frame
To add rows to a data frame, you can use the built-in function rbind().
The following example demonstrates how rbind() works on an R data frame.
R
Products <- data.frame(
  Product_ID = c(101, 102, 103),
  Product_Name = c("T-Shirt", "Jeans", "Shoes"),
  Price = c(15.99, 29.99, 49.99),
  Stock = c(50, 30, 25)
)
cat("Existing dataframe (Products):\n")
print(Products)
# Build the new row as a one-row data frame so each column keeps its type.
# A plain c(104, "Sunglasses", 39.99, 40) would coerce every value to character.
New_Product <- data.frame(
  Product_ID = 104,
  Product_Name = "Sunglasses",
  Price = 39.99,
  Stock = 40
)
Products <- rbind(Products, New_Product)
cat("\nUpdated dataframe after adding a new product:\n")
print(Products)
Output:
Existing dataframe (Products):
  Product_ID Product_Name Price Stock
1        101      T-Shirt 15.99    50
2        102        Jeans 29.99    30
3        103        Shoes 49.99    25

Updated dataframe after adding a new product:
  Product_ID Product_Name Price Stock
1        101      T-Shirt 15.99    50
2        102        Jeans 29.99    30
3        103        Shoes 49.99    25
4        104   Sunglasses 39.99    40
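One thing to watch when appending a row: a vector built with c() can hold only one type, so mixing numbers and strings turns every value into a string, and rbind() then converts the affected columns to character. The sketch below, using a made-up two-row `Products` table, compares that with appending a one-row data frame.

```r
Products <- data.frame(
  Product_ID   = c(101, 102),
  Product_Name = c("T-Shirt", "Jeans"),
  Price        = c(15.99, 29.99),
  stringsAsFactors = FALSE
)

# Appending a mixed c() vector: the numeric columns become character.
as_vector <- rbind(Products, c(103, "Shoes", 49.99))
print(sapply(as_vector, class))

# Appending a one-row data frame: each column keeps its type.
as_frame <- rbind(
  Products,
  data.frame(Product_ID = 103, Product_Name = "Shoes", Price = 49.99,
             stringsAsFactors = FALSE)
)
print(sapply(as_frame, class))
```

The printed table looks much the same in both cases, so checking column classes is a simple way to catch the coercion before doing arithmetic on a column such as Price.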
Add Columns in R Data Frame
To add columns to a data frame, you can use the built-in function cbind().
The following example demonstrates how cbind() works on an R data frame.
R
Products <- data.frame(
  Product_ID = c(101, 102, 103),
  Product_Name = c("T-Shirt", "Jeans", "Shoes"),
  Price = c(15.99, 29.99, 49.99),
  Stock = c(50, 30, 25)
)
cat("Existing dataframe (Products):\n")
print(Products)
# The new column takes its name from the variable passed to cbind()
Discount <- c(5, 10, 8)
Products <- cbind(Products, Discount)
cat("\nUpdated dataframe after adding a new column 'Discount':\n")
print(Products)
Output:
Existing dataframe (Products):
  Product_ID Product_Name Price Stock
1        101      T-Shirt 15.99    50
2        102        Jeans 29.99    30
3        103        Shoes 49.99    25

Updated dataframe after adding a new column 'Discount':
  Product_ID Product_Name Price Stock Discount
1        101      T-Shirt 15.99    50        5
2        102        Jeans 29.99    30       10
3        103        Shoes 49.99    25        8
Remove Rows and Columns
Rows and columns can also be removed from an existing R data frame.
Remove Row in R DataFrame
R
# subset() is part of base R, so no extra package is needed here
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)
data
# Remove the row with friend_id = 3
data <- subset(data, friend_id != 3)
data
Output:
  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai
  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai
In the above code, we first created a data frame called data with three columns: friend_id, friend_name, and location. To remove a row with friend_id equal to 3, we used the subset() function and specified the condition friend_id != 3. This removed the row with friend_id equal to 3.
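Rows can also be dropped with plain bracket indexing instead of subset(). The following is a minimal base R sketch; the data frame name `friends` and the result names are chosen only for this illustration.

```r
friends <- data.frame(
  friend_id = 1:5,
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

# Drop a row by position: a negative index excludes that row.
without_third <- friends[-3, ]
print(without_third)

# Drop rows by condition: keep only the rows where the test is TRUE.
without_id3 <- friends[friends$friend_id != 3, ]
print(without_id3)
```

Both forms keep the original row names, which is why the remaining rows are still labelled 1, 2, 4, and 5.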
Remove Column in R DataFrame
R
# select() comes from the dplyr package
library(dplyr)
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)
data
# Remove the 'location' column
data <- select(data, -location)
data
Output:
  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai
  friend_id friend_name
1         1      Sachin
2         2      Sourav
3         3      Dravid
4         4      Sehwag
5         5       Dhoni
To remove the location column, we used the select() function from the dplyr package and specified -location. The minus sign (-) indicates that the location column should be dropped. The resulting data frame has only two columns: friend_id and friend_name.
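If dplyr is not available, a column can be removed with base R alone. The sketch below shows two common ways; the names `friends`, `trimmed`, and `kept` are illustrative only.

```r
friends <- data.frame(
  friend_id = 1:5,
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai"),
  stringsAsFactors = FALSE
)

# Option 1: assigning NULL to a column deletes it.
trimmed <- friends
trimmed$location <- NULL
print(names(trimmed))

# Option 2: keep every column except the named one.
kept <- friends[, setdiff(names(friends), "location")]
print(names(kept))

# When only one column remains, drop = FALSE keeps the result a data frame.
only_id <- friends[, "friend_id", drop = FALSE]
print(class(only_id))
```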
Combining Data Frames in R
There are two ways to combine data frames in R: vertically (stacking rows) or horizontally (placing columns side by side).
Let's look at both cases with examples:
Combine R Data Frame Vertically
If you want to combine two data frames vertically, you can use the rbind() function. It accepts two or more data frames, which must have the same column names.
R
df1 <- data.frame(
  Name = c("Alice", "Bob"),
  Age = c(25, 30),
  Score = c(80, 75)
)
df2 <- data.frame(
  Name = c("Charlie", "David"),
  Age = c(28, 35),
  Score = c(90, 85)
)
cat("Dataframe 1:\n")
print(df1)
cat("\nDataframe 2:\n")
print(df2)
# Stack the rows of df2 below the rows of df1
combined_df <- rbind(df1, df2)
cat("\nCombined Dataframe:\n")
print(combined_df)
Output:
Dataframe 1:
   Name Age Score
1 Alice  25    80
2   Bob  30    75

Dataframe 2:
     Name Age Score
1 Charlie  28    90
2   David  35    85

Combined Dataframe:
     Name Age Score
1   Alice  25    80
2     Bob  30    75
3 Charlie  28    90
4   David  35    85
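When stacking data frames, rbind() matches columns by name rather than by position, and it stops with an error if the names differ. The sketch below illustrates both cases with small made-up data frames; `df3` and its `Years` column are hypothetical.

```r
df1 <- data.frame(Name = c("Alice", "Bob"), Age = c(25, 30),
                  stringsAsFactors = FALSE)

# Same column names in a different order: rbind() lines them up by name.
df2 <- data.frame(Age = 28, Name = "Charlie", stringsAsFactors = FALSE)
combined <- rbind(df1, df2)
print(combined)

# Different column names: rbind() raises an error, caught here for display.
df3 <- data.frame(Name = "David", Years = 35, stringsAsFactors = FALSE)
result <- tryCatch(
  rbind(df1, df3),
  error = function(e) conditionMessage(e)
)
print(result)
```

Renaming columns so that they agree before calling rbind() avoids the error.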
Combine R Data Frame Horizontally
If you want to combine two data frames horizontally, you can use the cbind() function. It accepts two or more data frames, which should have the same number of rows.
R
df1 <- data.frame(
  Name = c("Alice", "Bob"),
  Age = c(25, 30),
  Score = c(80, 75)
)
df2 <- data.frame(
  Height = c(160, 175),
  Weight = c(55, 70)
)
cat("Dataframe 1:\n")
print(df1)
cat("\nDataframe 2:\n")
print(df2)
# Place the columns of df2 to the right of the columns of df1
combined_df <- cbind(df1, df2)
cat("\nCombined Dataframe:\n")
print(combined_df)
Output:
Dataframe 1:
   Name Age Score
1 Alice  25    80
2   Bob  30    75

Dataframe 2:
  Height Weight
1    160     55
2    175     70

Combined Dataframe:
   Name Age Score Height Weight
1 Alice  25    80    160     55
2   Bob  30    75    175     70
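Because cbind() pairs rows purely by position, it is worth checking the row counts first. The following sketch uses small made-up data frames; `df_bad` is a hypothetical frame with a mismatched number of rows.

```r
df1 <- data.frame(Name = c("Alice", "Bob"), Age = c(25, 30),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Height = c(160, 175))

# Combine only when the row counts agree.
if (nrow(df1) == nrow(df2)) {
  combined <- cbind(df1, df2)
  print(combined)
}

# Three rows against two: cbind() raises an error, caught here for display.
df_bad <- data.frame(Height = c(160, 175, 180))
result <- tryCatch(
  cbind(df1, df_bad),
  error = function(e) conditionMessage(e)
)
print(result)
```

cbind() does not check that row i of one data frame describes the same entity as row i of the other, so both inputs should be in the same order.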
In this article we have covered R data frames and the basic operations on them: creating, accessing, summarizing, adding, removing, and combining. The aim is to make you familiar with data frames in R so that you can use them in your own projects.