
How to convert dataframe columns from factors to characters in R?

Last Updated : 26 May, 2021

In this article, we will discuss how to convert dataframe columns from factors to characters in the R programming language. A dataframe can hold columns of different types side by side in a tabular structure, and it allows both easy modification of column data and conversion between data types. R provides several ways to perform this type conversion on the columns of a dataframe:

Method 1 : Using transform() method

The transform() method can be used to modify the data object passed as its first argument. It returns a modified copy, so the result has to be explicitly assigned, either back to the same dataframe or to a new one. This method can be used to either add new variables to the data or modify the existing ones.

Syntax: transform(_data, ...)

Arguments :

  • _data : The data object to be modified
  • ... : One or more expressions of the form tag = value, where tag names the column to add or replace

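Initially, sapply() reports the class of col3 as factor; after applying transform(), it is character. The values themselves are preserved during this conversion. Note that col1 also appears as a factor in the output, because data.frame() converts character vectors to factors when stringsAsFactors = TRUE, which was the default before R 4.0.0.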

Example:

R




# declare a dataframe
# stringsAsFactors = TRUE (the default
# before R 4.0.0) stores every column,
# including col1, as a factor
data_frame <- data.frame(
                col1 = as.character(6:9), 
                col2 = factor(4:7), 
                col3 = factor(letters[1:4]),
                stringsAsFactors = TRUE
                )
  
print("Original DataFrame")
print (data_frame)
  
# indicating the data type of each 
# variable 
sapply(data_frame, class)
  
# converting factor type column to
# character
data_frame_col3 <- transform(
  data_frame, col3 = as.character(col3))
  
print("Modified col3 DataFrame")
print (data_frame_col3)
  
# indicating the data type of each variable 
sapply(data_frame_col3, class)


Output

[1] "Original DataFrame"
  col1 col2 col3
1    6    4    a
2    7    5    b
3    8    6    c
4    9    7    d
    col1     col2     col3
"factor" "factor" "factor"
[1] "Modified col3 DataFrame"
  col1 col2 col3
1    6    4    a
2    7    5    b
3    8    6    c
4    9    7    d
       col1        col2        col3
   "factor"    "factor" "character"

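Since transform() accepts several tag = value expressions, more than one column can be converted in a single call. The following is a minimal sketch under that assumption; the names df and df_chr are illustrative and do not come from the example above.

```r
# Illustrative sample data; stringsAsFactors = TRUE stores col1 as a factor too
df <- data.frame(
  col1 = as.character(6:9),
  col2 = factor(4:7),
  col3 = factor(letters[1:4]),
  stringsAsFactors = TRUE
)

# Convert two factor columns in one transform() call;
# the result is a modified copy, df itself is left unchanged
df_chr <- transform(
  df,
  col2 = as.character(col2),
  col3 = as.character(col3)
)

# Inspect the class of each column in the copy
sapply(df_chr, class)
```

Because as.character() on a factor returns its labels, col2 holds the strings "4" to "7" afterwards rather than the underlying integer codes.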
Method 2: Using dplyr package

The dplyr package is used to perform data manipulation. It is part of the tidyverse collection of packages and provides a large number of built-in functions. Its mutate_at() method applies a function to the selected columns and replaces them with the result. Since dplyr 1.0.0, mutate_at() is superseded by mutate() combined with across(), although it continues to work.

Syntax: mutate_at(.data, .vars, .funs)

Arguments : 

  • .data : The data to modify
  • .vars : The variables (columns) to modify
  • .funs : The function to apply to the selected variables

Initially, sapply() reports the class of col2 as factor; after applying mutate_at(), it is character. The values themselves are preserved during this conversion.

Example:

R




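# load dplyr for %>% and mutate_at()
library(dplyr)

# declare a dataframe
# stringsAsFactors = TRUE (the default
# before R 4.0.0) stores every column,
# including col1, as a factor
data_frame <- data.frame(
  "col1" = as.character(6:9), 
  "col2" = factor(4:7), 
  "col3" = factor(letters[1:4]),
  stringsAsFactors = TRUE
)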
  
print("Original DataFrame")
print (data_frame)
  
# indicating the data type of 
# each variable 
sapply(data_frame, class)
  
# converting factor type column 
# to character
data_frame <- data_frame%>%mutate_at(
  "col2", as.character)
  
print("Modified col2 DataFrame")
print (data_frame)
  
# indicating the data type of 
# each variable 
sapply(data_frame, class)


Output

[1] "Original DataFrame"
  col1 col2 col3
1    6    4    a
2    7    5    b
3    8    6    c
4    9    7    d
    col1     col2     col3
"factor" "factor" "factor"
[1] "Modified col2 DataFrame"
  col1 col2 col3
1    6    4    a
2    7    5    b
3    8    6    c
4    9    7    d
       col1        col2        col3
   "factor" "character"    "factor"

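In dplyr 1.0.0 and later, the same conversion can be written with mutate() and across(). The following is a sketch only, assuming such a dplyr version is installed; the names df, df_one and df_all are illustrative.

```r
library(dplyr)

# Illustrative sample data; stringsAsFactors = TRUE stores col1 as a factor too
df <- data.frame(
  col1 = as.character(6:9),
  col2 = factor(4:7),
  col3 = factor(letters[1:4]),
  stringsAsFactors = TRUE
)

# Convert a single named column
df_one <- df %>% mutate(across(col2, as.character))

# Convert every column that is currently a factor
df_all <- df %>% mutate(across(where(is.factor), as.character))

sapply(df_one, class)
sapply(df_all, class)
```

The where(is.factor) selector picks columns by type rather than by name, which is convenient when the factor columns are not known in advance.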
Method 3: Using lapply() method

The lapply() method in R applies a function to each element of a list and returns a list of the same length as the input. Each element of the output is the result of applying FUN to the corresponding element of the input. Because a dataframe is a list of columns, lapply() processes it column by column.

Syntax: lapply(X, FUN, …)

Arguments : 

  • X: The list (here, the dataframe) to make modifications to.
  • FUN: The function to be applied to each element (column) of X.

The lapply() method returns a list, not a dataframe. However, we assign its result to data_frame[], which replaces the columns in place and keeps the dataframe structure, eliminating the need for explicit conversion back to a dataframe. Since as.character is applied to every column of the dataframe, all column types change to character, not only the factor columns.
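The difference between a plain assignment and an assignment into the empty-bracket form can be sketched as follows. The names df and as_list are illustrative.

```r
# Illustrative sample data with two factor columns
df <- data.frame(
  col2 = factor(c("tzx", "hi", "gfg", "cse")),
  col3 = factor(letters[1:4])
)

# A plain assignment would store a list, not a dataframe
as_list <- lapply(df, as.character)
class(as_list)

# Assigning into df[] replaces the columns but keeps the dataframe
df[] <- lapply(df, as.character)
class(df)
```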

Example:

R




# declare a dataframe
# stringsAsFactors = TRUE (the default
# before R 4.0.0) stores every column,
# including col1, as a factor
data_frame <- data.frame(
                col1 = as.character(6:9), 
                col2 = factor(c('tzx','hi','gfg','cse')), 
                col3 = factor(letters[1:4]),
                stringsAsFactors = TRUE
                )
  
print("Original DataFrame")
print (data_frame)
  
# indicating the data type of each 
# variable 
sapply(data_frame, class)
  
# converting every column to 
# character
data_frame[] <- lapply(data_frame, as.character)
print("Modified DataFrame")
print (data_frame)
  
# indicating the data type of each variable 
sapply(data_frame, class)


Output

[1] "Original DataFrame"
  col1 col2 col3
1    6  tzx    a
2    7   hi    b
3    8  gfg    c
4    9  cse    d
    col1     col2     col3
"factor" "factor" "factor"
[1] "Modified DataFrame"
  col1 col2 col3
1    6  tzx    a
2    7   hi    b
3    8  gfg    c
4    9  cse    d
       col1        col2        col3
"character" "character" "character"

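Applying as.character to every column also converts numeric columns, which is not always wanted. A possible variation, sketched below, restricts the conversion to factor columns. The names df, id and is_fct are illustrative.

```r
# Illustrative sample data: one integer column and two factor columns
df <- data.frame(
  id   = 1:4,
  col2 = factor(c("tzx", "hi", "gfg", "cse")),
  col3 = factor(letters[1:4])
)

# Logical vector marking which columns are factors
is_fct <- sapply(df, is.factor)

# Convert only those columns; id is left untouched
df[is_fct] <- lapply(df[is_fct], as.character)

sapply(df, class)
```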
