How to change row values based on a column value in R dataframe ?

In this article, we will see how to change the values in the rows of a data frame based on the values in one of its columns, using the R programming language.

Syntax: df[expression ,] <- newrowvalue

Arguments:

  • df – The data frame to modify
  • expression – A logical expression on a column value that selects the rows to change
  • newrowvalue – The new value that replaces the old values in the selected rows

Returns: Doesn't return anything useful; the assignment changes the data frame itself.

The following code snippet is an example of changing a value based on a column value in R. For each row, it checks whether the value in column C3 is less than 11 and, if so, replaces that value with NA, leaving the other columns of the row unchanged. Because it loops over every cell, this approach takes time proportional to the number of rows multiplied by the number of columns of the data frame.

Example:

R




# declaring a data frame in R
data_frame <- data.frame(C1 = 5:8, C2 = 1:4,
                         C3 = 9:12, C4 = 13:16)

print("Original data frame")
print(data_frame)

# replace the value in C3 with NA if it is
# less than 11, looping over every cell of
# the data frame
for (i in 1:nrow(data_frame)) {
    for (j in 1:ncol(data_frame)) {

        # checking if the column is C3, that is,
        # the j index is 3
        if (j == 3) {

            # checking if the value of C3 in this
            # row is less than 11
            if (data_frame[i, j] < 11) {

                # changing the value in the
                # data frame
                data_frame[i, j] <- NA
            }
        }
    }
}

# printing modified data frame
print("Modified data frame")
print(data_frame)


Output:

[1] "Original data frame"
  C1 C2 C3 C4
1  5  1  9 13
2  6  2 10 14
3  7  3 11 15
4  8  4 12 16
[1] "Modified data frame"
  C1 C2 C3 C4
1  5  1 NA 13
2  6  2 NA 14
3  7  3 11 15
4  8  4 12 16

This approach can be optimized when we know the index of the column to evaluate. In that case, we do not iterate over the entire data frame but only over the values of that column, so the running time is proportional to the number of rows.

Example:

R




# declaring a data frame in R
data_frame <- data.frame(C1 = 5:8, C2 = 1:4,
                         C3 = 9:12, C4 = 13:16)

print("Original data frame")
print(data_frame)

# replace the value with 0 if the data
# element at col index 2 is odd (not
# divisible by 2), looping over the rows
for (i in 1:nrow(data_frame)) {

    # look at the 2nd column only and check
    # whether the remainder after dividing
    # by 2 is non-zero
    if (data_frame[i, 2] %% 2 != 0) {

        # replace the value with 0
        data_frame[i, 2] <- 0
    }
}

# printing modified data frame
print("Modified data frame")
print(data_frame)


Output:

[1] "Original data frame"
  C1 C2 C3 C4
1  5  1  9 13
2  6  2 10 14
3  7  3 11 15
4  8  4 12 16
[1] "Modified data frame"
  C1 C2 C3 C4
1  5  0  9 13
2  6  2 10 14
3  7  0 11 15
4  8  4 12 16
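One caveat with the row loops above is worth a small sketch. The loops use 1:nrow(data_frame), which works for the four-row data frame here, but for a data frame with no rows the same expression counts down from 1 to 0 instead of being empty. The snippet below illustrates this with a hypothetical empty data frame named empty_frame and a hypothetical counter named visited (neither is part of the original example), and shows seq_len() as one possible alternative.

```r
# a hypothetical data frame with columns but no rows
empty_frame <- data.frame(C1 = integer(0), C2 = integer(0))

# 1:nrow() counts down from 1 to 0 when there are no rows,
# so a loop over it would still run twice
print(1:nrow(empty_frame))

# seq_len() yields an empty sequence instead,
# so the loop body is skipped entirely
visited <- 0
for (i in seq_len(nrow(empty_frame))) {
    visited <- visited + 1
}

# visited is still 0 because the loop never ran
print(visited)
```

For data frames that always have at least one row, both forms visit the same rows.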

R also provides an inbuilt way of handling these row transformations: the condition to be evaluated is specified as the row index of the data frame. The assigned value then replaces every value in the rows where the condition is TRUE. Explicit iteration over the data frame is not required in this case.

Example:

R




# declaring a data frame in R
data_frame <- data.frame(C1 = c(1, 2, 2, 1), C2 = 1:4,
                         C3 = 9:12, C4 = 13:16)

print("Original data frame")
print(data_frame)

# every row where the C1 value is greater
# than or equal to 1 (here, all four rows)
# has all of its values replaced by 3
data_frame[data_frame$C1 >= 1, ] <- 3

print("Modified data frame")
print(data_frame)


Output:

[1] "Original data frame"
  C1 C2 C3 C4
1  1  1  9 13
2  2  2 10 14
3  2  3 11 15
4  1  4 12 16
[1] "Modified data frame"
  C1 C2 C3 C4
1  3  3  3  3
2  3  3  3  3
3  3  3  3  3
4  3  3  3  3
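The same indexing form can also name a column after the comma, so that only one value in each matching row changes instead of the whole row. The following is a minimal sketch, not part of the original examples, that repeats the two loop-based modifications shown earlier (NA for C3 values below 11, and 0 for odd C2 values) without any explicit loop.

```r
# rebuild the data frame used in the first two examples
data_frame <- data.frame(C1 = 5:8, C2 = 1:4,
                         C3 = 9:12, C4 = 13:16)

# set C3 to NA in the rows where C3 is less than 11
data_frame[data_frame$C3 < 11, "C3"] <- NA

# set C2 to 0 in the rows where C2 is odd
data_frame[data_frame$C2 %% 2 != 0, "C2"] <- 0

print(data_frame)
#   C1 C2 C3 C4
# 1  5  0 NA 13
# 2  6  2 NA 14
# 3  7  0 11 15
# 4  8  4 12 16
```

If the tested column may already contain NA, wrapping the condition in which() keeps only the rows where the condition is TRUE.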



Last Updated : 21 Apr, 2021