How to remove NA values with dplyr filter

Last Updated : 16 Apr, 2024

In this article, we will examine several ways to remove rows containing NA values from a data frame in the R programming language, using dplyr's filter() and related functions.

Remove NA values with the dplyr filter

R offers several ways to drop rows with missing (NA) values inside a dplyr pipeline. Which one to use depends on whether you want to check every column, a chosen set of columns, or a single column. The sections below cover each case in turn.

Remove Rows with NA Values in Any Column

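When working with datasets, it is sometimes necessary to remove every row that contains an NA value in any column. The na.omit() function does this. Note that na.omit() comes from base R (the stats package), not from dplyr, but it takes a data frame as its first argument and therefore works inside a dplyr pipeline.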

library(dplyr)

# Create a sample data frame
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))
# Print the original data frame
df

# Remove rows with NA values in any column
result <- df %>%
  na.omit()

# View the result
print(result)

Output:

  team points assists rebounds
1    A     99      33       NA
2    A     90      NA       28
3    B     86      31       24
4    B     88      39       24
5    C     NA      34       28

  team points assists rebounds
3    B     86      31       24
4    B     88      39       24
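The same result can also be obtained with filter() itself. The following is a minimal sketch, assuming dplyr 1.0.4 or later (where if_all() is available); the names result_if_all and result_cc are chosen here for illustration only.

```r
library(dplyr)

# Same sample data frame as above
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))

# Keep only rows in which every column is non-missing
result_if_all <- df %>%
  filter(if_all(everything(), ~ !is.na(.x)))

# Alternative: base R's complete.cases() used as the filter condition
result_cc <- df %>%
  filter(complete.cases(.))

print(result_if_all)
print(result_cc)
```

Both approaches keep the two team B rows. Unlike na.omit(), filter() resets the row names of a plain data frame, so the rows are numbered from 1 rather than keeping their original positions.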

Remove Rows with NA Values in Certain Columns

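At times, we might want to remove rows with NA values only in specific columns while ignoring missing values elsewhere. dplyr provides the filter_at() function for this purpose: vars() selects the columns to check, and all_vars() requires the condition to hold in every one of them. Note that filter_at() is marked as superseded in recent dplyr releases in favor of filter() combined with if_all() or if_any(); it still works, as shown here: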

library(dplyr)

# Create a sample data frame
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))
# Print the original data frame
df

# Remove rows with an NA value in the 'points' or 'assists' column
result <- df %>%
  filter_at(vars(points, assists), all_vars(!is.na(.)))

# View the result
print(result)

Output:

  team points assists rebounds
1    A     99      33       NA
2    A     90      NA       28
3    B     86      31       24
4    B     88      39       24
5    C     NA      34       28

  team points assists rebounds
1    A     99      33       NA
2    B     86      31       24
3    B     88      39       24
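For newer code, the same filter can be written with if_all(). The sketch below assumes dplyr 1.0.4 or later; result_all and result_any are illustrative names, and the if_any() variant is included only to contrast the two helpers.

```r
library(dplyr)

# Same sample data frame as above
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))

# Keep rows where BOTH 'points' and 'assists' are non-missing
result_all <- df %>%
  filter(if_all(c(points, assists), ~ !is.na(.x)))

# Keep rows where AT LEAST ONE of 'points' or 'assists' is non-missing
result_any <- df %>%
  filter(if_any(c(points, assists), ~ !is.na(.x)))

print(result_all)
print(result_any)
```

With this data, if_all() keeps three rows (the first A row and both B rows), matching the filter_at() output above. if_any() keeps all five rows, because no row is missing both points and assists.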

Remove Rows with NA Values in One Specific Column

In some scenarios, we only care about one column and want to remove the rows where that column is NA. Passing !is.na() on that column to filter() does this: is.na() returns TRUE for missing values, and the ! operator negates it so that only non-missing rows are kept.

library(dplyr)

# Create a sample data frame
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))
# Print the original data frame
df

# Remove rows with an NA value in the 'points' column
result <- df %>%
  filter(!is.na(points))

# View the result
print(result)

Output:

  team points assists rebounds
1    A     99      33       NA
2    A     90      NA       28
3    B     86      31       24
4    B     88      39       24
5    C     NA      34       28

  team points assists rebounds
1    A     99      33       NA
2    A     90      NA       28
3    B     86      31       24
4    B     88      39       24
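A common mistake is to test for missing values with != NA instead of is.na(). The short sketch below illustrates why that fails; the names wrong and right are illustrative only.

```r
library(dplyr)

# Same sample data frame as above
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C'),
                 points = c(99, 90, 86, 88, NA),
                 assists = c(33, NA, 31, 39, 34),
                 rebounds = c(NA, 28, 24, 24, 28))

# Any comparison with NA yields NA, and filter() drops rows
# whose condition is NA, so this removes every row
wrong <- df %>%
  filter(points != NA)

# is.na() is the correct test for missing values
right <- df %>%
  filter(!is.na(points))

nrow(wrong)  # 0
nrow(right)  # 4
```

Because filter() keeps only rows where the condition evaluates to TRUE, a condition that is NA for every row silently produces an empty data frame rather than an error.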

Conclusion

In conclusion, rows with NA values can be removed from a data frame in several ways within a dplyr pipeline: na.omit() drops rows with a missing value in any column, filter_at() restricts the check to selected columns, and filter() with !is.na() targets a single column. Choosing the narrowest check that fits the analysis avoids discarding rows that are only missing values in columns you do not need.
