# R – Data Frames

R is an open-source programming language that is widely used for statistical computing and data analysis. Data frames are generic R data objects used to store tabular data.

Data frames can be thought of as matrices in which each column may hold a different data type. A data frame is made up of three principal components: the data, the rows, and the columns.

## R Data Frames Structure

A data frame is laid out as a table: each column is a named variable, and each row is a single record.

The data is presented in tabular form, which makes it easier to operate on and understand.

## Create Dataframe in R Programming Language

To create a data frame, call the data.frame() function and pass each of the vectors you have created as a named argument.

```r
# R program to create a data frame

# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)

# print the data frame
print(friend.data)
```

Output:

```
  friend_id friend_name
1         1      Sachin
2         2      Sourav
3         3      Dravid
4         4      Sehwag
5         5       Dhoni
```
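As noted in the introduction, each column of a data frame can hold a different data type. The following is a minimal sketch of how that could be checked; the `pets` data frame and its column names are hypothetical and used only for illustration.

```r
# Hypothetical example data; the names are illustrative only
pets <- data.frame(
  name = c("Rex", "Milo", "Luna"),    # character column
  age = c(3L, 5L, 2L),                # integer column
  weight = c(12.5, 4.2, 3.8),         # numeric (double) column
  vaccinated = c(TRUE, FALSE, TRUE),  # logical column
  stringsAsFactors = FALSE
)

# Report the class of every column
col_classes <- sapply(pets, class)
print(col_classes)
```

Each column is stored as its own vector, which is why the types can differ from one column to the next.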

## Get the Structure of the R Data Frame

You can inspect the structure of a data frame with the str() function.

str() can display even the internal structure of large nested lists. For basic R objects it prints a compact, one-line description of each component, letting the user know about the object and its constituents.

```r
# R program to get the
# structure of the data frame

# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)

# using str(); it prints the structure itself and returns NULL,
# so wrapping it in print() would add a stray "NULL" line
str(friend.data)
```

Output:

```
'data.frame':	5 obs. of  2 variables:
 $ friend_id  : int  1 2 3 4 5
 $ friend_name: chr  "Sachin" "Sourav" "Dravid" "Sehwag" ...
```

## Summary of Data in the R data frame

The statistical summary and nature of the data in a data frame can be obtained with the summary() function.

summary() is a generic function that is also used to summarize the results of various model-fitting functions. It invokes a particular method depending on the class of its first argument.

```r
# R program to get the
# summary of the data frame

# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)

# using summary()
print(summary(friend.data))
```

Output:

```
   friend_id friend_name       
 Min.   :1   Length:5          
 1st Qu.:2   Class :character  
 Median :3   Mode  :character  
 Mean   :3                     
 3rd Qu.:4                     
 Max.   :5                     
```
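Because summary() chooses its method from the class of its argument, it can also be applied to a single column. A small sketch, reusing the friend.data example from above:

```r
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

# Numeric column: summary() reports quartiles, mean, minimum and maximum
id_summary <- summary(friend.data$friend_id)
print(id_summary)

# Character column: summary() reports length, class and mode instead
name_summary <- summary(friend.data$friend_name)
print(name_summary)
```

The individual statistics can be pulled out by name, for example `id_summary[["Mean"]]`.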

## Extract Data from Data Frame in R

Extracting data from a data frame means accessing its rows or columns. You can extract a specific column from a data frame by using its column name.

```r
# R program to extract
# data from the data frame

# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)

# Extracting friend_name column
result <- data.frame(friend.data$friend_name)
print(result)
```

Output:

```
  friend.data.friend_name
1                  Sachin
2                  Sourav
3                  Dravid
4                  Sehwag
5                   Dhoni
```
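The example above extracts a column; rows can be extracted in a similar way with `[row, column]` indexing. The following is a minimal sketch on the same friend.data example; the variable names `second_row`, `late_rows`, and `first_names` are introduced here only for illustration.

```r
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

# Row 2 with all columns (note the trailing comma)
second_row <- friend.data[2, ]
print(second_row)

# Rows that satisfy a condition on a column
late_rows <- friend.data[friend.data$friend_id > 3, ]
print(late_rows)

# Rows 1 to 3 of a single column; this returns a plain character vector
first_names <- friend.data[1:3, "friend_name"]
print(first_names)
```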

## Expand Data Frame in R Language

A data frame in R can be expanded by adding new columns and rows to the already existing data frame.

```r
# R program to expand
# the data frame

# creating a data frame
friend.data <- data.frame(
    friend_id = c(1:5),
    friend_name = c("Sachin", "Sourav",
                    "Dravid", "Sehwag",
                    "Dhoni"),
    stringsAsFactors = FALSE
)

# Expanding data frame
friend.data$location <- c("Kolkata", "Delhi",
                          "Bangalore", "Hyderabad",
                          "Chennai")
resultant <- friend.data

# print the modified data frame
print(resultant)
```

Output:

```
  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai
```

In R, you can perform many kinds of operations on a data frame, such as accessing rows and columns, selecting a subset of the data frame, editing values, and deleting rows and columns.

Please refer to DataFrame Operations in R to know about all types of operations that can be performed on a data frame.

## Access Items in R Data Frame

We can access the columns of a data frame by using single brackets `[ ]`, double brackets `[[ ]]`, or the `$` operator.

```r
# creating a data frame
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)

# Access Items using []
friend.data[1]

# Access Items using [[]]
friend.data[['friend_name']]

# Access Items using $
friend.data$friend_id
```

Output:

```
  friend_id
1         1
2         2
3         3
4         4
5         5
[1] "Sachin" "Sourav" "Dravid" "Sehwag" "Dhoni" 
[1] 1 2 3 4 5
```
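The three access forms differ in what they return, which the output above hints at: single brackets give back a data frame, while double brackets and `$` give back the column's underlying vector. A short sketch that makes this explicit (the variable names are illustrative):

```r
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

as_frame <- friend.data["friend_name"]      # one-column data frame
as_vector <- friend.data[["friend_name"]]   # character vector
same_vector <- friend.data$friend_name      # character vector

print(class(as_frame))
print(class(as_vector))
print(identical(as_vector, same_vector))

# A single element: row 3 of the friend_name column
print(friend.data[3, "friend_name"])
```

Single brackets are the usual choice when the result needs to stay a data frame; double brackets or `$` are more convenient when the values themselves are needed.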

## Amount of Rows and Columns

We can find out how many rows and columns are present in a data frame by using the dim() function.

```r
# creating a data frame
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)

# find out the number of rows and columns
dim(friend.data)
```

Output:

`[1] 5 2`
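dim() returns the row count followed by the column count. When only one of the two is needed, the base R functions nrow() and ncol() return them separately, as in this short sketch:

```r
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

n_rows <- nrow(friend.data)  # number of rows
n_cols <- ncol(friend.data)  # number of columns

cat("Rows:", n_rows, "\n")
cat("Columns:", n_cols, "\n")
```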

## Add Rows and Columns in R Data Frame

You can easily add rows and columns to an R data frame. Insertion expands the already existing data frame without needing to create a new one.

Let’s look at how to add rows and columns to a data frame, with an example of each:

### Add Rows in R Data Frame

To add rows to a data frame, you can use the built-in function rbind().

The following example demonstrates how rbind() works on a data frame.

```r
# Creating a dataframe representing products in a store
Products <- data.frame(
  Product_ID = c(101, 102, 103),
  Product_Name = c("T-Shirt", "Jeans", "Shoes"),
  Price = c(15.99, 29.99, 49.99),
  Stock = c(50, 30, 25)
)

# Print the existing dataframe
cat("Existing dataframe (Products):\n")
print(Products)

# Adding a new row for a new product.
# A one-row data frame keeps each value's type; a plain c() vector
# would turn every value, and so every column, into character.
New_Product <- data.frame(
  Product_ID = 104,
  Product_Name = "Sunglasses",
  Price = 39.99,
  Stock = 40
)
Products <- rbind(Products, New_Product)

# Print the updated dataframe after adding the new product
cat("\nUpdated dataframe after adding a new product:\n")
print(Products)
```

Output:

```
Existing dataframe (Products):
  Product_ID Product_Name Price Stock
1        101      T-Shirt 15.99    50
2        102        Jeans 29.99    30
3        103        Shoes 49.99    25

Updated dataframe after adding a new product:
  Product_ID Product_Name Price Stock
1        101      T-Shirt 15.99    50
2        102        Jeans 29.99    30
3        103        Shoes 49.99    25
4        104   Sunglasses 39.99    40
```
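One detail worth checking when adding rows is how the new row is built. A vector created with c() that mixes numbers and text is coerced to character, and binding it to a data frame converts the numeric columns as well. The sketch below compares that with binding a one-row data frame; the variable names are illustrative, and stringsAsFactors = FALSE is set explicitly so the behavior does not depend on the R version.

```r
Products <- data.frame(
  Product_ID = c(101, 102, 103),
  Product_Name = c("T-Shirt", "Jeans", "Shoes"),
  Price = c(15.99, 29.99, 49.99),
  stringsAsFactors = FALSE
)

# c() mixes numbers and text, so every element becomes a character string
row_as_vector <- c(104, "Sunglasses", 39.99)
coerced <- rbind(Products, row_as_vector)
print(sapply(coerced, class))

# A one-row data frame keeps each value's own type
row_as_frame <- data.frame(
  Product_ID = 104,
  Product_Name = "Sunglasses",
  Price = 39.99,
  stringsAsFactors = FALSE
)
preserved <- rbind(Products, row_as_frame)
print(sapply(preserved, class))
```

The printed tables look the same in both cases, so comparing column classes is the reliable way to spot the difference before doing arithmetic on a column such as Price.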

### Add Columns in R Data Frame

To add columns to a data frame, you can use the built-in function cbind().

The following example demonstrates how cbind() works on a data frame.

```r
# Existing dataframe representing products in a store
Products <- data.frame(
  Product_ID = c(101, 102, 103),
  Product_Name = c("T-Shirt", "Jeans", "Shoes"),
  Price = c(15.99, 29.99, 49.99),
  Stock = c(50, 30, 25)
)

# Print the existing dataframe
cat("Existing dataframe (Products):\n")
print(Products)

# Adding a new column for 'Discount' to the dataframe
Discount <- c(5, 10, 8)  # New column values for discount

# cbind() names the new column after the variable, so no rename is needed
Products <- cbind(Products, Discount)

# Print the updated dataframe after adding the new column
cat("\nUpdated dataframe after adding a new column 'Discount':\n")
print(Products)
```

Output:

```
Existing dataframe (Products):
  Product_ID Product_Name Price Stock
1        101      T-Shirt 15.99    50
2        102        Jeans 29.99    30
3        103        Shoes 49.99    25

Updated dataframe after adding a new column 'Discount':
  Product_ID Product_Name Price Stock Discount
1        101      T-Shirt 15.99    50        5
2        102        Jeans 29.99    30       10
3        103        Shoes 49.99    25        8
```

## Remove Rows and Columns

Rows and columns can also be removed from an already existing R data frame.

```r
# Create a data frame
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)

data

# Remove the row with friend_id = 3
# (subset() is part of base R, so no package needs to be loaded)
data <- subset(data, friend_id != 3)

data
```

Output:

```
  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai

  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai
```

In the above code, we first created a data frame called data with three columns: friend_id, friend_name, and location. To remove a row with friend_id equal to 3, we used the subset() function and specified the condition friend_id != 3. This removed the row with friend_id equal to 3.

```r
library(dplyr)

# Create a data frame
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)

data

# Remove the 'location' column
data <- select(data, -location)

data
```

Output:

```
  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai

  friend_id friend_name
1         1      Sachin
2         2      Sourav
3         3      Dravid
4         4      Sehwag
5         5       Dhoni
```

To remove the location column, we used the select() function from the dplyr package and specified -location. The minus sign indicates that we want to remove the location column. The resulting data frame data has only two columns: friend_id and friend_name.
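If loading dplyr is not desirable, a column can also be dropped with base R alone. The following sketch shows two common options on the same example data; the names `without_location` and `also_without` are introduced here only for illustration.

```r
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)

# Option 1: assigning NULL to a column removes it
without_location <- data
without_location$location <- NULL
print(without_location)

# Option 2: keep every column whose name is not "location";
# drop = FALSE keeps the result a data frame even if one column remains
also_without <- data[, names(data) != "location", drop = FALSE]
print(also_without)
```

Option 1 modifies the data frame it is applied to, which is why the sketch works on a copy; option 2 builds a new data frame and leaves the original untouched.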

## Combining Data Frames in R

There are two ways to combine data frames in R. You can combine them either vertically or horizontally.

Let’s look at both cases with examples:

### Combine R Data Frame Vertically

If you want to combine two data frames vertically, you can use the rbind() function. It works on two or more data frames, which must have the same column names.

```r
# Creating two sample dataframes
df1 <- data.frame(
  Name = c("Alice", "Bob"),
  Age = c(25, 30),
  Score = c(80, 75)
)

df2 <- data.frame(
  Name = c("Charlie", "David"),
  Age = c(28, 35),
  Score = c(90, 85)
)

# Print the existing dataframes
cat("Dataframe 1:\n")
print(df1)

cat("\nDataframe 2:\n")
print(df2)

# Combining the dataframes using rbind()
combined_df <- rbind(df1, df2)

# Print the combined dataframe
cat("\nCombined Dataframe:\n")
print(combined_df)
```

Output:

```
Dataframe 1:
   Name Age Score
1 Alice  25    80
2   Bob  30    75

Dataframe 2:
     Name Age Score
1 Charlie  28    90
2   David  35    85

Combined Dataframe:
     Name Age Score
1   Alice  25    80
2     Bob  30    75
3 Charlie  28    90
4   David  35    85
```
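Vertical combination relies on the data frames sharing column names. The sketch below, using a hypothetical `df3` whose second column is named differently, shows what happens when they do not match and one way to repair it; tryCatch() is used only so the example keeps running after the error.

```r
df1 <- data.frame(Name = c("Alice", "Bob"), Age = c(25, 30))

# Hypothetical data frame with 'Years' where df1 has 'Age'
df3 <- data.frame(Name = "Charlie", Years = 28)

# rbind() stops with an error because the column names differ
result <- tryCatch(
  rbind(df1, df3),
  error = function(e) conditionMessage(e)
)
print(result)

# Renaming the column makes the two data frames compatible
names(df3)[names(df3) == "Years"] <- "Age"
fixed <- rbind(df1, df3)
print(fixed)
```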

### Combine R Data Frame Horizontally

If you want to combine two data frames horizontally, you can use the cbind() function. It works on two or more data frames, which must have the same number of rows.

```r
# Creating two sample dataframes
df1 <- data.frame(
  Name = c("Alice", "Bob"),
  Age = c(25, 30),
  Score = c(80, 75)
)

df2 <- data.frame(
  Height = c(160, 175),
  Weight = c(55, 70)
)

# Print the existing dataframes
cat("Dataframe 1:\n")
print(df1)

cat("\nDataframe 2:\n")
print(df2)

# Combining the dataframes using cbind()
combined_df <- cbind(df1, df2)

# Print the combined dataframe
cat("\nCombined Dataframe:\n")
print(combined_df)
```

Output:

```
Dataframe 1:
   Name Age Score
1 Alice  25    80
2   Bob  30    75

Dataframe 2:
  Height Weight
1    160     55
2    175     70

Combined Dataframe:
   Name Age Score Height Weight
1 Alice  25    80    160     55
2   Bob  30    75    175     70
```
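Horizontal combination pairs rows purely by position, so the data frames need the same number of rows. The following sketch uses a hypothetical `df_long` with one row too many to show the failure, then trims it to fit; tryCatch() is there only so the example continues after the error.

```r
df1 <- data.frame(Name = c("Alice", "Bob"), Age = c(25, 30))

# Hypothetical data frame with three rows instead of two
df_long <- data.frame(Height = c(160, 175, 180))

# cbind() stops with an error because the row counts differ
result <- tryCatch(
  cbind(df1, df_long),
  error = function(e) conditionMessage(e)
)
print(result)

# Keeping only the first two rows makes the shapes compatible;
# drop = FALSE keeps the one-column selection a data frame
trimmed <- df_long[1:2, , drop = FALSE]
combined <- cbind(df1, trimmed)
print(combined)
```

Because the pairing is positional, both data frames should already be sorted in the same order before they are combined this way.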