How to create, index and modify Data Frame in R?

  • Last Updated : 27 Jun, 2022

In this article, we will discuss how to create, index, and modify a data frame in the R programming language.

Creating a Data Frame:

A Data Frame is a two-dimensional labeled data structure. Its columns may hold different types, but every value within one column has the same type. It looks like a table in SQL or an Excel worksheet. In R, a Data Frame is created with the data.frame() function. The syntax is:

data <- data.frame(columnName1 = c(data1, data2, ...),
                   ...,
                   columnNameN = c(data1, data2, ...))

Example:

In this example, let's look at how to create a Data Frame in R using the data.frame() function. NA marks a missing value.

R




# create a data frame 
stats <- data.frame(player=c('A', 'B', 'C', 'D'),
                runs=c(100, 200, 408, NA),
                wickets=c(17, 20, NA, 5))
  
print("stats Dataframe")
stats

Output

[1] "stats Dataframe"
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D   NA       5
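
Because each column can have its own type, it is often useful to check what R actually stored. The following is a minimal sketch that rebuilds the same stats data frame and inspects it; the variable names col_types and dims are illustrative, and stringsAsFactors = FALSE is passed explicitly as an assumption so that the player column is character on every R version.

```r
# Same data as the example above; stringsAsFactors = FALSE keeps
# the player column as character even on R versions before 4.0.
stats <- data.frame(player = c('A', 'B', 'C', 'D'),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5),
                    stringsAsFactors = FALSE)

# Class of each column, returned as a named character vector
col_types <- sapply(stats, class)
print(col_types)

# Number of rows and number of columns
dims <- dim(stats)
print(dims)

# Compact overview of column names, types, and first values
str(stats)
```

Here sapply() applies class() to every column, so a column that was read with an unexpected type shows up immediately.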

Indexing the Data Frame:

To access particular data in a Data Frame, use square brackets and specify either a column name, or the row numbers and column numbers to fetch. Indexing in R starts at 1. The syntax for the different ways of indexing a data frame is given below.

# fetching the data in particular column
data["columnName"]

# fetching data of specified rows and 
# columns
data[ fromRow : toRow , columnNumber]

# e.g. fetches the first to third rows
# of the second column
data[1:3, 2]

Example:

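The code below creates a data frame and indexes it, first fetching a whole column by name and then fetching specified rows of a particular column. Note that stats["player"] returns a data frame with one column, while stats[1:3,2] returns a plain vector.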

R




# create a data frame 
stats <- data.frame(player=c('A', 'B', 'C', 'D'),
                runs=c(100, 200, 408, NA),
                wickets=c(17, 20, NA, 5))
  
print("stats Dataframe")
stats
  
# fetch data in certain column
stats["player"]
print("----------")
  
# fetch certain rows and columns
stats[1:3,2]

Output

[1] "stats Dataframe"
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D   NA       5
  player
1      A
2      B
3      C
4      D
[1] "----------"
[1] 100 200 408
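
Square brackets with a name or a row range are not the only way to index. The sketch below shows a few other standard forms on the same stats data frame: the $ and [[ ]] operators, the drop = FALSE argument, and a logical condition in the row position. The variable names (runs_vec, runs_df, high_scorers) and the threshold of 150 are illustrative assumptions, not part of the example above.

```r
stats <- data.frame(player = c('A', 'B', 'C', 'D'),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5),
                    stringsAsFactors = FALSE)

# $ and [[ ]] both return the column as a plain vector
runs_vec <- stats$runs
wickets_vec <- stats[["wickets"]]

# Selecting a single column with [rows, column] drops to a vector
# by default; drop = FALSE keeps the result as a data frame
runs_df <- stats[1:3, "runs", drop = FALSE]

# A logical condition in the row position filters rows.
# which() discards the NA comparison for player D, so no
# all-NA row appears in the result.
high_scorers <- stats[which(stats$runs > 150), ]
print(high_scorers)
```

Without which(), a comparison against NA yields NA, and R would return an extra row filled with NA values for it.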

Modify the Data Frame:

Data Modification in a Data Frame

To modify the data in a data frame, we use indexing and reassignment techniques. Let’s look into the syntax of how to modify the data in a data frame.

data[rowNumber, "columnName"] <- newValue
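
The same reassignment works when the rows are chosen by a condition instead of a row number. The following is a small sketch, assuming we want to treat a missing runs value as 0 and correct one wickets entry; the replacement values 0 and 12 are made up for illustration.

```r
stats <- data.frame(player = c('A', 'B', 'C', 'D'),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5),
                    stringsAsFactors = FALSE)

# Replace every missing value in the runs column with 0
stats[is.na(stats$runs), "runs"] <- 0

# Update a single cell by row number and column name
stats[3, "wickets"] <- 12

print(stats)
```

The new value should match the column's type; assigning a string into a numeric column would convert the whole column to character.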

Adding a row to a Data Frame

To add a row to a data frame, use the rbind() function. In the form used here it takes two arguments: the data frame, and the row to insert, given as a list of values in column order. rbind() returns a new data frame, so assign the result back to keep the change. The syntax of rbind() is given below:

rbind(dataframeName, list(data1, data2, ...))
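
As an alternative to a positional list, the new row can be passed as a one-row data frame with named columns, which rbind() matches by name. The sketch below assumes the same stats data frame; new_row is a hypothetical variable name.

```r
stats <- data.frame(player = c('A', 'B', 'C', 'D'),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5),
                    stringsAsFactors = FALSE)

# Build the new row as a one-row data frame with matching column names
new_row <- data.frame(player = 'E', runs = 500, wickets = 1,
                      stringsAsFactors = FALSE)

# rbind() returns a new data frame; assign it back to keep the row
stats <- rbind(stats, new_row)
print(stats)
```

Naming the columns makes the intent explicit and avoids silently putting a value in the wrong column when the order is mistaken.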

Adding a column to a Data Frame

To add a column to a data frame, use the cbind() function. In the form used here it takes two arguments: the data frame, and the data for the new column, passed as a named argument with one value per row. Like rbind(), it returns a new data frame. Below is the syntax of the cbind() function:

cbind(dataframeName, columnName = c(data1, data2, ...))
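
A column can also be added by assigning to a new name with the $ operator, which modifies the data frame in place. The sketch below assumes the same stats data frame; the matches values are made up, and runs_per_match is a hypothetical derived column.

```r
stats <- data.frame(player = c('A', 'B', 'C', 'D'),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5),
                    stringsAsFactors = FALSE)

# Assigning to a name that does not exist yet creates the column;
# the vector needs one value per row
stats$matches <- c(2, 3, 10, 2)

# A new column can be computed from existing ones;
# NA in runs propagates to the result
stats$runs_per_match <- stats$runs / stats$matches

print(stats)
```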

Removing a row and column from a Data Frame

To remove a row or a column from a data frame, use the syntax below. A negative row number selects every row except that one.

# remove the row with the specified row number
dataframeName <- dataframeName[-rowNumber,]

# remove column from a dataframe
dataframeName$columnName <- NULL
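
The same ideas extend to removing several rows or columns at once. The following is a minimal sketch on the stats data frame; drop_cols is a hypothetical variable holding the names of the columns to remove.

```r
stats <- data.frame(player = c('A', 'B', 'C', 'D'),
                    runs = c(100, 200, 408, NA),
                    wickets = c(17, 20, NA, 5),
                    stringsAsFactors = FALSE)

# Remove rows 2 and 4 with a negative index vector
stats <- stats[-c(2, 4), ]

# Remove columns by name: keep every column not listed in drop_cols.
# drop = FALSE keeps a data frame even if one column remains.
drop_cols <- c("wickets")
stats <- stats[, !(names(stats) %in% drop_cols), drop = FALSE]

print(stats)
```

The remaining rows keep their original row names (1 and 3 here); assigning rownames(stats) <- NULL renumbers them if that is preferred.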

Example: 

In this example, we create a data frame and then modify it: we update a value, insert a row and a column, and delete a row and a column.

R




# create a data frame 
stats <- data.frame(player=c('A', 'B', 'C', 'D'),
                runs=c(100, 200, 408, NA),
                wickets=c(17, 20, NA, 5))
  
cat("stats Dataframe\n")
stats
  
# modify the data
stats[4,"runs"] <- 274
cat("\nModified dataframe\n")
stats
  
# added new row
cat("\nDataFrame after a row insertion\n")
stats<-rbind(stats,list('E',500,1))
print(stats)
  
# added new column
cat("\nDataFrame after a new column insertion\n")
stats<-cbind(stats,matches=c(2,3,10,2,12))
print(stats)
  
# deleted the second row
stats<-stats[-2,]
  
# deleted the wickets column
stats$wickets<-NULL
  
cat("\nDataframe after deletion of a row & column\n")
stats

 Output

stats Dataframe
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D   NA       5

Modified dataframe
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D  274       5

DataFrame after a row insertion
  player runs wickets
1      A  100      17
2      B  200      20
3      C  408      NA
4      D  274       5
5      E  500       1

DataFrame after a new column insertion
  player runs wickets matches
1      A  100      17       2
2      B  200      20       3
3      C  408      NA      10
4      D  274       5       2
5      E  500       1      12

Dataframe after deletion of a row & column
  player runs matches
1      A  100       2
3      C  408      10
4      D  274       2
5      E  500      12
