
Remove rows with missing values using R

Last Updated : 15 Mar, 2024

In this article, we will explore various methods to remove rows containing missing values (NA) in the R Programming Language.

What are missing values?

Missing values are data points that are absent for a specific variable in a dataset. They can be represented in various ways, such as blank spaces, null values, or special symbols like "NA". Missing values can occur for several reasons, such as data entry errors or equipment malfunction. Dealing with missing data is a crucial step in data analysis. Two methods for removing rows with missing values are:

  1. na.omit()
  2. complete.cases()
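
Before removing anything, it can help to check how many values are actually missing. The following is a minimal sketch using a hypothetical data frame named scores (it is not one of the examples in this article). It assumes blank strings should be treated as missing; R itself only treats NA as missing, so blanks have to be recoded first.

```r
# Hypothetical data: one blank string and one true NA
scores <- data.frame(
  name  = c("Asha", "", "Ravi"),
  marks = c(90, 85, NA),
  stringsAsFactors = FALSE
)

# Count NA values per column; the blank string is not counted as missing
before <- colSums(is.na(scores))
print(before)

# Assumption: a blank name means "missing", so recode it as NA
scores$name[scores$name == ""] <- NA

# Count again; the recoded blank now shows up as NA
after <- colSums(is.na(scores))
print(after)
```

Once blanks are recoded as NA, both functions listed above will treat those rows as incomplete.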

Remove rows with missing values using na.omit()

The na.omit() function removes NA values from a dataset row-wise. It checks each row and drops any row that contains one or more NA values. For example, applying na.omit() to a data frame returns a copy of that data frame without the rows that contain NA.

Syntax:

na.omit(dataframe)

Here, we create a data frame and apply na.omit() to it, which removes every row that contains at least one NA value.

R
df1= data.frame(  
  A1 = c(NA, 10, NA, 7, 8, 11,20),
  A2 = c("A", 9, 3, "B", "C", "D","E"),
  A3 = c(1, 0, NA, 1, 1, NA,3)
)

#printing the dataframe
print(df1)

print("After removing the NA values ")
result=na.omit(df1)
print(result)

Output:

  A1 A2 A3
1 NA  A  1
2 10  9  0
3 NA  3 NA
4  7  B  1
5  8  C  1
6 11  D NA
7 20  E  3

[1] "After removing the NA values "
  A1 A2 A3
2 10  9  0
4  7  B  1
5  8  C  1
7 20  E  3
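
To see which rows were dropped, one option is to inspect the "na.action" attribute that na.omit() attaches to its result. The sketch below rebuilds the same df1 as above; the variable names dropped and removed_count are only illustrative.

```r
df1 <- data.frame(
  A1 = c(NA, 10, NA, 7, 8, 11, 20),
  A2 = c("A", 9, 3, "B", "C", "D", "E"),
  A3 = c(1, 0, NA, 1, 1, NA, 3)
)
result <- na.omit(df1)

# Positions of the rows that na.omit() removed
dropped <- as.integer(attr(result, "na.action"))
print(dropped)        # 1 3 6

# Number of rows removed
removed_count <- nrow(df1) - nrow(result)
print(removed_count)  # 3
```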

Here, we build a data frame from three vectors. Applying na.omit() then removes every row that contains at least one NA value.

R
vec1=c(1,2,NA,4,8,4)
vec2=c(6,7,8,9,2,9)
vec3=c(34,NA,67,78,23,12)

df1=data.frame(vec1,vec2,vec3)
#printing the dataframe
print(df1)

print("After removing NA values: ")
df2=na.omit(df1)
print(df2)

Output:

  vec1 vec2 vec3
1    1    6   34
2    2    7   NA
3   NA    8   67
4    4    9   78
5    8    2   23
6    4    9   12

[1] "After removing NA values: "
  vec1 vec2 vec3
1    1    6   34
4    4    9   78
5    8    2   23
6    4    9   12
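
Note that the remaining rows keep their original row names (1, 4, 5, 6). If consecutive numbering is preferred, one possible approach is to reset the row names after filtering, as in this short sketch:

```r
vec1 <- c(1, 2, NA, 4, 8, 4)
vec2 <- c(6, 7, 8, 9, 2, 9)
vec3 <- c(34, NA, 67, 78, 23, 12)

df2 <- na.omit(data.frame(vec1, vec2, vec3))

# Row names still refer to positions in the original data frame
print(rownames(df2))   # "1" "4" "5" "6"

# Reset to consecutive row names 1..n
rownames(df2) <- NULL
print(rownames(df2))   # "1" "2" "3" "4"
```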

Remove rows with missing values using complete.cases()

The complete.cases() function identifies complete rows in a data frame, a matrix, or a vector. It does not remove anything by itself: it returns a logical vector with TRUE for every row that has no missing values and FALSE otherwise. Using that vector as a row index keeps only the complete rows.

For example, if you have a dataset and want to remove the rows that have missing values, you can index it with the result of complete.cases().
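
The sketch below shows what complete.cases() returns on its own, using the same data as the na.omit() example. The variable name keep is only illustrative.

```r
df1 <- data.frame(
  A1 = c(NA, 10, NA, 7, 8, 11, 20),
  A2 = c("A", 9, 3, "B", "C", "D", "E"),
  A3 = c(1, 0, NA, 1, 1, NA, 3)
)

# One logical value per row: TRUE means the row has no NA
keep <- complete.cases(df1)
print(keep)       # FALSE TRUE FALSE TRUE TRUE FALSE TRUE

# Number of complete rows
print(sum(keep))  # 4

# Negating the vector shows the rows that would be removed
print(df1[!keep, ])
```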

Syntax:

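complete.cases(dataframe)

Here, we create a data frame and use complete.cases() as a row index to keep only the rows without NA values.
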
R
df1 <- data.frame(  
  A1 = c(NA, 10, NA, 7, 8, 11,20),
  A2 = c("A", 9, 3, "B", "C", "D","E"),
  A3 = c(1, 0, NA, 1, 1, NA,3)
)

#printing the dataframe
print(df1)

print("After removing the NA values ")
result=df1[complete.cases(df1),]
print(result)

Output:

  A1 A2 A3
1 NA  A  1
2 10  9  0
3 NA  3 NA
4  7  B  1
5  8  C  1
6 11  D NA
7 20  E  3

[1] "After removing the NA values "
  A1 A2 A3
2 10  9  0
4  7  B  1
5  8  C  1
7 20  E  3
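
Because complete.cases() returns a logical vector, it can also be applied to a subset of columns when only some variables matter. The following sketch assumes that only A1 and A2 need to be complete; that choice of columns is an illustration, not a requirement of the function.

```r
df1 <- data.frame(
  A1 = c(NA, 10, NA, 7, 8, 11, 20),
  A2 = c("A", 9, 3, "B", "C", "D", "E"),
  A3 = c(1, 0, NA, 1, 1, NA, 3)
)

# Check completeness only in columns A1 and A2
keep <- complete.cases(df1[, c("A1", "A2")])

# Row 6 is kept even though A3 is NA there
partial <- df1[keep, ]
print(partial)
```

With this data, rows 1 and 3 are dropped because A1 is NA, while row 6 stays.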

Here, we build a data frame from three vectors. Indexing it with complete.cases() then removes every row that contains at least one NA value.

R
vec1 = c(1,2,NA,4,8,4)
vec2 = c(6,7,8,9,2,9)
vec3 = c(34,NA,67,78,23,12)

#creating the dataframe from the vectors
df1=data.frame(vec1,vec2,vec3)

#printing the dataframe
print(df1)

print("After removing the NA values ")
result=df1[complete.cases(df1),]
print(result)

Output:

  vec1 vec2 vec3
1    1    6   34
2    2    7   NA
3   NA    8   67
4    4    9   78
5    8    2   23
6    4    9   12

[1] "After removing the NA values "
  vec1 vec2 vec3
1    1    6   34
4    4    9   78
5    8    2   23
6    4    9   12
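
As a quick check, the two approaches can be compared on the same data. This is a sketch with illustrative variable names; it suggests that both keep the same rows and differ only in the extra "na.action" attribute that na.omit() attaches.

```r
vec1 <- c(1, 2, NA, 4, 8, 4)
vec2 <- c(6, 7, 8, 9, 2, 9)
vec3 <- c(34, NA, 67, 78, 23, 12)
df <- data.frame(vec1, vec2, vec3)

by_na_omit  <- na.omit(df)
by_complete <- df[complete.cases(df), ]

# Same rows are kept by both approaches
same_rows <- identical(rownames(by_na_omit), rownames(by_complete))
print(same_rows)

# Same values, ignoring attributes such as "na.action"
same_values <- isTRUE(all.equal(by_na_omit, by_complete,
                                check.attributes = FALSE))
print(same_values)

# Only the na.omit() result records which rows were dropped
print(is.null(attr(by_na_omit, "na.action")))
print(is.null(attr(by_complete, "na.action")))
```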

Conclusion

In conclusion, we learned two different methods for removing rows with missing values: na.omit() and complete.cases(). The R language offers versatile tools for efficient data manipulation and analysis.


