
How to remove NA values with dplyr filter

Last Updated : 16 Apr, 2024

In this article, we will look at several ways to remove rows containing NA values from a data frame in R, using filter() from the dplyr package and a few related functions.

Remove NA values with the dplyr filter

R offers several ways to drop rows that contain NA values. The sections below cover three common cases: removing rows with an NA in any column, in a chosen set of columns, and in one specific column.

Remove Rows with NA Values in Any Column

When working with datasets, sometimes it’s necessary to remove entire rows containing any NA values. The na.omit() function does this. It comes from base R (the stats package), not from dplyr, but it fits into a dplyr pipeline because it takes the data frame as its first argument.

R
library(dplyr)

# Create a sample data frame
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))

# Print the original data frame
df

# Remove rows with NA values in any column
result <- df %>%
  na.omit()

# View the result
print(result)

Output:

  team points assists rebounds
1    A     99      33       NA
2    A     90      NA       28
3    B     86      31       24
4    B     88      39       24
5    C     NA      34       28

  team points assists rebounds
3    B     86      31       24
4    B     88      39       24
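
The same rows can also be selected with filter() itself. The following is a minimal sketch, assuming dplyr 1.0.4 or later (where if_all() is available); the name complete_rows is only illustrative and does not appear in the original example.

```r
library(dplyr)

# Same sample data frame as above
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))

# Keep only rows where every column is non-NA
# (assumes dplyr >= 1.0.4 for if_all())
complete_rows <- df %>%
  filter(if_all(everything(), ~ !is.na(.x)))

print(complete_rows)
```

This keeps the same two team B rows as na.omit(). One difference to expect is in the row labels: na.omit() keeps the original row numbers (3 and 4), while filter() typically renumbers rows with default row names.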

Remove Rows with NA Values in Certain Columns

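At times, we might want to remove rows with NA values only in specific columns while ignoring NA values elsewhere. dplyr provides the filter_at() function for this purpose. It still works, although recent dplyr releases mark the scoped verbs such as filter_at() as superseded in favor of filter() combined with if_all() or if_any(). Let’s see how it’s done: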

R
library(dplyr)

# Create a sample data frame
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))

# Print the original data frame
df

# Remove rows with an NA value in either the 'points' or the 'assists' column
result <- df %>%
  filter_at(vars(points, assists), all_vars(!is.na(.)))

# View the result
print(result)

Output:

  team points assists rebounds
1    A     99      33       NA
2    A     90      NA       28
3    B     86      31       24
4    B     88      39       24
5    C     NA      34       28

  team points assists rebounds
1    A     99      33       NA
2    B     86      31       24
3    B     88      39       24
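
For newer code, the same filter can be written without filter_at(). This is a sketch under the assumption that dplyr 1.0.4 or later is installed; both_present and either_present are hypothetical names chosen for illustration.

```r
library(dplyr)

# Same sample data frame as above
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))

# Keep rows where BOTH 'points' and 'assists' are non-NA
# (equivalent in intent to the filter_at() + all_vars() call)
both_present <- df %>%
  filter(if_all(c(points, assists), ~ !is.na(.x)))

# Keep rows where AT LEAST ONE of the two columns is non-NA
either_present <- df %>%
  filter(if_any(c(points, assists), ~ !is.na(.x)))

print(both_present)
print(either_present)
```

With this data, both_present holds the three rows shown in the output above, while either_present keeps all five rows because no row is missing both values.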

Remove Rows with NA Values in One Specific Column

In some scenarios, we only care about one column and want to remove the rows where that column is NA. filter() keeps the rows for which its condition is TRUE, so combining it with !is.na() keeps exactly the rows where the column has a value:

R
library(dplyr)

# Create a sample data frame
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))

# Print the original data frame
df

# Remove rows with NA value in 'points' column
result <- df %>%
  filter(!is.na(points))

# View the result
print(result)

Output:

  team points assists rebounds
1    A     99      33       NA
2    A     90      NA       28
3    B     86      31       24
4    B     88      39       24
5    C     NA      34       28

  team points assists rebounds
1    A     99      33       NA
2    A     90      NA       28
3    B     86      31       24
4    B     88      39       24
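
A related point is that filter() drops rows whose condition evaluates to NA, not only rows where it is FALSE. The sketch below illustrates this with a threshold of 87, which is an arbitrary value picked for the example; high_scorers and with_missing are likewise illustrative names.

```r
library(dplyr)

# Same sample data frame as above
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))

# NA > 87 evaluates to NA, so the team C row is dropped
# even though is.na() is never called
high_scorers <- df %>%
  filter(points > 87)

# To keep rows with a missing 'points' value, ask for them explicitly
with_missing <- df %>%
  filter(points > 87 | is.na(points))

print(high_scorers)
print(with_missing)
```

So an explicit !is.na() test is mainly needed when NA removal is the only condition, or when it should be visible to someone reading the code.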

Conclusion

In conclusion, dplyr's filter() combined with !is.na() removes rows that have an NA value in one specific column, filter_at() applies the same test to a chosen set of columns, and na.omit() from base R drops every row that has an NA in any column. Choosing the narrowest option that fits the analysis avoids discarding more rows than necessary during data cleaning.


