Remove rows with empty cells in R

Last Updated : 23 May, 2021

A dataframe can hold cells of different data types. It may also contain rows in which every column is blank or missing. Such rows carry no information, act as dummy records, and are termed empty rows. There are multiple ways to remove them.

Method 1: Removing rows using for loop

A vector is declared to keep the indexes of the rows in which every value is blank. A for loop iterates over the rows of the dataframe. For each row, a counter is set to 0 to count the blank values in that row, and an inner loop iterates over the columns. Each cell value is compared to the blank value "", and if it matches, the counter is incremented. After each inner loop finishes, the counter is compared to the number of columns in the dataframe. If the two are equal, the row index is appended to the vector. After the outer loop ends, the rows whose indices are stored in the vector are deleted by placing '-' in front of the row index vector.

The time complexity of this approach is O(m * n), where m is the number of rows and n is the number of columns.

Example:

R

# declaring a dataframe; every column is stored as character,
# because c() coerces mixed values such as "" and 2 to strings
data_frame <- data.frame(col1 = c("", "b", "", "", "e"),
                         col2 = c("", 2, "", 4, 5),
                         col3 = c("", FALSE, "", "", TRUE))

print("Original dataframe")
print(data_frame)

# declaring an empty vector to store
# the rows with all the blank values
vec <- c()

# looping the rows
for (i in seq_len(nrow(data_frame))) {

    # counter for blank values in
    # each row
    count <- 0

    # looping through columns
    for (j in seq_len(ncol(data_frame))) {

        # checking if the value is blank
        # (isTRUE() treats an NA comparison as "not blank")
        if (isTRUE(data_frame[i, j] == "")) {
            count <- count + 1
        }
    }

    # if count is equivalent to number
    # of columns
    if (count == ncol(data_frame)) {

        # append row number
        vec <- append(vec, i)
    }
}

# deleting rows using index in vector; the length check matters
# because indexing with an empty negative vector returns zero rows
data_frame_mod <- if (length(vec) > 0) data_frame[-vec, ] else data_frame
print("Modified dataframe")
print(data_frame_mod)

Output

[1] "Original dataframe"
 col1 col2  col3
1                
2    b    2 FALSE
3                
4         4      
5    e    5  TRUE
[1] "Modified dataframe"
 col1 col2  col3
2    b    2 FALSE
4         4      
5    e    5  TRUE
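One edge case is worth knowing when rows are deleted by negative index: if no row is entirely blank, the index vector stays empty, and negating an empty vector selects zero rows instead of all of them. The sketch below illustrates this with a small made-up dataframe; the names df, blank_rows and result are illustrative only.

```r
# Small illustrative dataframe with no all-blank rows
df <- data.frame(a = c("x", "y"), b = c("1", "2"))

# Nothing was appended, so the index vector is still empty
blank_rows <- c()

# Pitfall: -c() is an empty index, which selects zero rows
unguarded <- df[-blank_rows, ]

# Guarded version: only drop rows when there is something to drop
result <- if (length(blank_rows) > 0) df[-blank_rows, ] else df

print(nrow(unguarded))
print(nrow(result))
```

Checking the length of the index vector before subsetting is one simple way to avoid losing every row.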

Method 2: Removing rows with all blank cells in R using apply method

The apply() method in R applies a specified function over the margins (rows or columns) of an array or matrix; a dataframe passed to it is first coerced to a matrix. It returns a vector, array, or list of the values obtained by applying the function to the corresponding margins.

Syntax: apply(X, MARGIN, FUN, …)

Parameters:

X – A dataframe or matrix

MARGIN – The axis over which to apply the function. 1 indicates rows, 2 indicates columns and c(1, 2) indicates rows and columns.

FUN – The function to be applied.

The condition checked for each cell is whether its value equals "", that is, whether it is blank. In this approach, FUN is all, which returns TRUE for a row only when every column in that row is blank. Negating that result with ! keeps the rows that have at least one non-blank cell.
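As a quick illustration of how the second argument (named MARGIN in R's documentation) selects rows or columns, here is a minimal sketch on a small made-up matrix; the variable names are illustrative only.

```r
# 2 rows, 3 columns, filled column-wise:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# 1 applies the function to each row, giving one value per row
row_totals <- apply(m, 1, sum)

# 2 applies the function to each column, giving one value per column
col_totals <- apply(m, 2, sum)

print(row_totals)  # 9 12
print(col_totals)  # 3 7 11
```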

Example:

R

# declaring a dataframe in which some rows are entirely blank
data_frame = data.frame(col1 = c("","b","","","e") ,  
                        col2 = c("",2,"",4,5),  
                        col3= c("",FALSE,"","", TRUE)) 
  
print ("Original dataframe") 
print (data_frame) 
  
# checking where the cells are not all empty 
data_frame_mod <- data_frame[!apply(data_frame == "", 1, all), ]   
print ("Modified dataframe") 
print (data_frame_mod ) 

Output

[1] "Original dataframe"
 col1 col2  col3
1                
2    b    2 FALSE
3                
4         4      
5    e    5  TRUE
[1] "Modified dataframe"
 col1 col2  col3
2    b    2 FALSE
4         4      
5    e    5  TRUE
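To see why the one-line filter works, it can help to split it into intermediate steps. The following is a sketch on a smaller made-up dataframe; blank_mask, all_blank and kept are illustrative names.

```r
df <- data.frame(col1 = c("", "b", ""),
                 col2 = c("", "2", "4"))

# Logical matrix with the same shape as df: TRUE where a cell is ""
blank_mask <- df == ""

# One logical value per row: TRUE only if every cell in the row is blank
all_blank <- apply(blank_mask, 1, all)

# Keep the rows that are not entirely blank (rows 2 and 3 here)
kept <- df[!all_blank, ]
print(kept)
```

Row 3 is kept even though col1 is blank, because all() requires every cell in the row to satisfy the condition.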

Method 3: Removing rows with all NA

A dataframe can contain missing values, represented as NA, in place of cell values. This approach combines several built-in R methods to remove the rows in which every value is NA.

The number of columns of the dataframe can be checked using the ncol() method.

Syntax:

ncol(df)

Each cell is checked for NA using the is.na() method, with the dataframe passed as its argument. It returns a logical matrix with the same dimensions as the original dataframe, containing TRUE where the value is NA and FALSE otherwise.

Syntax:

na_df <- is.na(df)
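Inspecting the result of is.na() on a dataframe directly shows that it is a logical matrix, which is what allows row-wise sums to be taken over it. A minimal sketch with a made-up two-row dataframe:

```r
df <- data.frame(x = c(NA, 1), y = c(NA, "a"))

na_df <- is.na(df)

print(is.matrix(na_df))  # TRUE: a logical matrix, not a dataframe
print(dim(na_df))        # same dimensions as df: 2 2

# TRUE counts as 1, so each row sum is that row's number of NA cells
na_per_row <- rowSums(na_df)
print(na_per_row)
```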

The rowSums() method is then applied to the logical values obtained in the previous step. Since TRUE counts as 1 and FALSE as 0, it returns a vector holding the number of missing values in each row.

Syntax:

rowSums(na_df)

Rows whose NA count is not equal to the number of columns are kept and stored in a separate variable as the output. If the two are equal, every column in that row contains NA, so the row is dropped.

Example:

R

# declaring a dataframe in which some rows are entirely NA
data_frame = data.frame(col1 = c(NA,"b",NA,NA,"e") ,  
                        col2 = c(NA,2,NA,4,5),  
                        col3= c(NA,FALSE,NA,NA, TRUE)) 
  
print ("Original dataframe") 
print (data_frame) 
  
# checking number of columns 
cols <- ncol(data_frame) 
  
# checking for which elements have  
# missing values 
is_na <- is.na(data_frame) 
  
# computes total number of nas  
# encountered in each row 
row_na <- rowSums(is_na) 
  
# checking where the cells are not  
# all NA 
data_frame_mod <- data_frame[row_na != cols, ]   
print ("Modified dataframe") 
print (data_frame_mod ) 

Output

[1] "Original dataframe"
 col1 col2  col3
1 <NA>   NA    NA
2    b    2 FALSE
3 <NA>   NA    NA
4 <NA>    4    NA
5    e    5  TRUE
[1] "Modified dataframe"
 col1 col2  col3
2    b    2 FALSE
4 <NA>    4    NA
5    e    5  TRUE
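Real data often mixes blank strings and NA in the same dataframe. One possible way to handle both at once is sketched below; drop_empty_rows is a hypothetical helper written for this illustration, not a built-in R function, and it assumes a row counts as empty when every cell is either NA or "".

```r
# Hypothetical helper: drops rows in which every cell is NA or ""
drop_empty_rows <- function(df) {
  # TRUE | NA is TRUE, so NA cells never leave an NA in the mask
  empty <- is.na(df) | df == ""
  df[rowSums(empty) != ncol(df), , drop = FALSE]
}

df <- data.frame(col1 = c(NA, "b", "", NA),
                 col2 = c("", "2", NA, "4"))

cleaned <- drop_empty_rows(df)
print(cleaned)  # rows 2 and 4 remain
```

The drop = FALSE argument keeps the result a dataframe even when the input has a single column.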
