How to Recode Values Using dplyr

Last Updated : 17 Apr, 2024

Recoding values is a common task in data analysis. In the R Programming Language, the dplyr package handles it with the mutate() function combined with a helper such as case_when() or recode() from dplyr itself, or ifelse() from base R. Let's explore how to recode values using dplyr.

Recode Values Using dplyr

Each example below recodes the 'category' column of a small tibble. In every case, mutate() adds the result as a new column, category_recode, so the original column stays available for comparison.

Using mutate() and case_when()

R
library(dplyr)
# Example dataset
data <- tibble(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)
print(data)  # Show the original dataset
# Recode values in 'category' column
data_recode <- data %>%
  mutate(category_recode = case_when(
    category == "A" ~ "Alpha",
    category == "B" ~ "Beta",
    category == "C" ~ "Gamma",
    TRUE ~ category  # Keep other values unchanged
  ))
# View the resulting dataset
print(data_recode)

Output:

# A tibble: 5 × 3
     id category value
  <int> <chr>    <dbl>
1     1 A           10
2     2 B           15
3     3 A           20
4     4 C           25
5     5 B           30

# A tibble: 5 × 4
     id category value category_recode
  <int> <chr>    <dbl> <chr>          
1     1 A           10 Alpha          
2     2 B           15 Beta           
3     3 A           20 Alpha          
4     4 C           25 Gamma          
5     5 B           30 Beta 

In this example, we use case_when() within mutate() to recode values in the 'category' column. Each rule is written as condition ~ new_value. Rules are checked in order, and the first condition that is true for a row determines its result. The final rule, TRUE ~ category, is a fallback that keeps any other value unchanged.
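Because case_when() checks its rules in order, it can also recode a numeric column into groups. The following is a minimal sketch, not part of the original example: the cut-offs 15 and 25 and the column name value_group are assumptions chosen for illustration.

```r
library(dplyr)

# Same 'id' and 'value' columns as the example dataset
data <- tibble(
  id = 1:5,
  value = c(10, 15, 20, 25, 30)
)

# Rules are evaluated top to bottom; the first TRUE condition wins,
# so a value of 10 matches "low" and never reaches the later rules.
# The cut-offs (15 and 25) are illustrative assumptions.
data_binned <- data %>%
  mutate(value_group = case_when(
    value < 15 ~ "low",
    value < 25 ~ "medium",
    TRUE ~ "high"  # Fallback for everything else
  ))

print(data_binned)
```

Since every value below 15 is already caught by the first rule, the second rule only needs an upper bound; the ordering does the rest.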

Using mutate() and recode()

R
library(dplyr)
# Example dataset
data <- tibble(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)
print(data)  # Show the original dataset
# Recode values in 'category' column
data_recode <- data %>%
  mutate(category_recode = recode(category,
                                  "A" = "Alpha",
                                  "B" = "Beta",
                                  "C" = "Gamma"))
# View the resulting dataset
print(data_recode)

Output:

# A tibble: 5 × 3
     id category value
  <int> <chr>    <dbl>
1     1 A           10
2     2 B           15
3     3 A           20
4     4 C           25
5     5 B           30

# A tibble: 5 × 4
     id category value category_recode
  <int> <chr>    <dbl> <chr>          
1     1 A           10 Alpha          
2     2 B           15 Beta           
3     3 A           20 Alpha          
4     4 C           25 Gamma          
5     5 B           30 Beta  

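Here, recode() is used within mutate() to recode values in the 'category' column directly. The function takes arguments in the form original_value = new_value. Note that as of dplyr 1.1.0, recode() is marked as superseded in favor of case_match(); it still works, but new code may prefer case_match() or case_when().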
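It helps to know what recode() does with values that have no mapping. The sketch below uses a hypothetical vector containing "D", which is not in the example dataset, and the replacement label "Other" is likewise an assumption. For a character vector, unmatched values are left as they are unless the .default argument is supplied.

```r
library(dplyr)

# Hypothetical input: "D" has no mapping below
category <- c("A", "B", "D")

# Without .default, unmatched character values are kept unchanged
kept <- recode(category, "A" = "Alpha", "B" = "Beta")

# With .default, every unmatched value is replaced by that value
other <- recode(category, "A" = "Alpha", "B" = "Beta", .default = "Other")

print(kept)
print(other)
```

This mirrors the TRUE ~ category fallback in the case_when() version: omitting .default keeps the original value, while setting it collapses everything unmapped into one label.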

Using mutate() and ifelse()

R
library(dplyr)
# Example dataset
data <- tibble(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)
print(data)  # Show the original dataset
# Recode values in 'category' column
data_recode <- data %>%
  mutate(category_recode = ifelse(category == "A", "Alpha",
                                   ifelse(category == "B", "Beta",
                                          ifelse(category == "C", "Gamma", category))))
# View the resulting dataset
print(data_recode)

Output:

# A tibble: 5 × 3
     id category value
  <int> <chr>    <dbl>
1     1 A           10
2     2 B           15
3     3 A           20
4     4 C           25
5     5 B           30

# A tibble: 5 × 4
     id category value category_recode
  <int> <chr>    <dbl> <chr>          
1     1 A           10 Alpha          
2     2 B           15 Beta           
3     3 A           20 Alpha          
4     4 C           25 Gamma          
5     5 B           30 Beta  

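In this approach, nested ifelse() calls are used within mutate() to recode values in the 'category' column. Each ifelse(test, yes, no) handles one value and passes everything else on to the next call, so the nesting grows with every additional category. This makes it less concise than case_when() or recode(), but it can be useful for simple recoding tasks.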
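A two-way recode is the kind of simple task where a single ifelse() reads well. The sketch below is an illustration rather than part of the original example: the threshold of 20 and the column name value_flag are assumptions.

```r
library(dplyr)

# Same 'id' and 'value' columns as the example dataset
data <- tibble(
  id = 1:5,
  value = c(10, 15, 20, 25, 30)
)

# One test, two outcomes: no nesting needed.
# The threshold (20) is an illustrative assumption.
data_flag <- data %>%
  mutate(value_flag = ifelse(value >= 20, "high", "low"))

print(data_flag)
```

Once a third outcome is needed, switching to case_when() usually keeps the code flatter than adding another nested ifelse().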

Conclusion

Recoding values using dplyr in R is a straightforward process. Whether you prefer case_when(), recode(), or ifelse(), combining one of them with mutate() lets you transform the values in your dataset to meet your analysis needs. Choose the approach that best fits your requirements and coding style.
