
How to Replace Multiple Values in a Data Frame Using dplyr

Replacing multiple values in a data frame means substituting specific values in one or more columns with new ones, usually to standardize or clean the data before analysis. In R, the dplyr package provides mutate() for creating or overwriting columns, and case_when() and recode() for mapping old values to new ones inside mutate().
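Because the replacement can apply to more than one column, here is a minimal sketch of running the same rules over several columns at once with across(). The survey data frame and its q1 and q2 columns are hypothetical and are not part of the examples that follow.

```r
library(dplyr)

# Hypothetical data: two columns that share the same short codes
survey <- data.frame(
  id = 1:3,
  q1 = c("Y", "N", "Y"),
  q2 = c("N", "N", "Y")
)

# Apply one set of replacement rules to both q1 and q2
survey_clean <- survey %>%
  mutate(across(c(q1, q2), ~ case_when(
    .x == "Y" ~ "Yes",
    .x == "N" ~ "No",
    TRUE      ~ .x   # Keep other values unchanged
  )))

print(survey_clean)
```

Inside across(), .x stands for the column currently being processed, so the rules only have to be written once.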

Replace Multiple Values Using mutate() and case_when()

library(dplyr)
# Example dataset
data <- data.frame(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)
data
# Replace multiple values in 'category' column
data_replaced <- data %>%
  mutate(category = case_when(
    category == "A" ~ "Alpha",
    category == "B" ~ "Beta",
    category == "C" ~ "Gamma",
    TRUE ~ category  # Keep other values unchanged
  ))
# View the resulting dataset
print(data_replaced)

Output:

  id category value
1  1        A    10
2  2        B    15
3  3        A    20
4  4        C    25
5  5        B    30

  id category value
1  1    Alpha    10
2  2     Beta    15
3  3    Alpha    20
4  4    Gamma    25
5  5     Beta    30

In this example, case_when() inside mutate() replaces each value in the 'category' column according to the first condition it matches. The final TRUE ~ category line is a catch-all that leaves any unmatched value unchanged.
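The conditions in case_when() do not have to be one-to-one equality checks. As a minimal sketch, assuming the same example data, the block below maps several old values to a single new one with %in% and also tests a different column. The labels "Group 1" and "High" are made up for illustration.

```r
library(dplyr)

data <- data.frame(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)

data_grouped <- data %>%
  mutate(category = case_when(
    category %in% c("A", "B") ~ "Group 1",  # several old values, one new value
    value >= 25               ~ "High",     # condition on another column
    TRUE                      ~ category    # Keep other values unchanged
  ))

print(data_grouped)
```

Conditions are checked from top to bottom, so row 5 (category "B", value 30) becomes "Group 1" rather than "High": the first matching condition wins.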

Replace Multiple Values Using mutate() and recode()

library(dplyr)
# Example dataset
data <- data.frame(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)
data
# Replace multiple values in 'category' column
data_replaced <- data %>%
  mutate(category = recode(category,
                           "A" = "Apple",
                           "B" = "Boys",
                           "C" = "Cats"))
# View the resulting dataset
print(data_replaced)

Output:

  id category value
1  1        A    10
2  2        B    15
3  3        A    20
4  4        C    25
5  5        B    30

  id category value
1  1    Apple    10
2  2     Boys    15
3  3    Apple    20
4  4     Cats    25
5  5     Boys    30

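Here, recode() inside mutate() maps each old value in the 'category' column directly to its replacement, which is more concise than case_when() when every rule is a simple one-to-one match. Values that are not listed are left unchanged.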
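When the list of replacements is long or is built elsewhere in a script, it may be easier to keep the mapping in a named vector and splice it into recode() with !!!. This is a minimal sketch using the same example data, and the name lookup is a hypothetical choice.

```r
library(dplyr)

data <- data.frame(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)

# Named vector: names are the old values, elements are the replacements
lookup <- c(A = "Apple", B = "Boys", C = "Cats")

# !!! splices the named vector into recode() as individual arguments
data_replaced <- data %>%
  mutate(category = recode(category, !!!lookup))

print(data_replaced)
```

Keeping the mapping separate from the pipeline means the same lookup can be reused for other columns or data frames.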

Conclusion

Using dplyr, you can replace multiple values in a data frame with case_when() or recode() inside mutate(). case_when() is the more flexible option because it accepts arbitrary conditions, while recode() is more concise for simple one-to-one replacements. Choose the approach that best fits your requirements and coding style.
