Split DataFrame Variable into Multiple Columns in R
In this article, we will discuss how to split a dataframe variable into multiple columns using the R programming language.
Method 1: Using strsplit() and do.call()
The strsplit() function in R splits each element of a character vector into substrings wherever the given pattern matches. It returns a list with one character vector of pieces per input element.
Syntax:
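strsplit(x, split, fixed = FALSE)
Parameter:
- x: The character vector to be split.
- split: The pattern to split each string by. It is interpreted as a regular expression unless fixed = TRUE.
- fixed: If TRUE, split is matched as a literal string.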
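Because strsplit() returns a list rather than a set of columns, it can help to look at its raw result before combining the pieces. The following is a minimal sketch in base R; the vector x and its values are made up for illustration.

```r
# A small character vector to split (illustrative values only)
x <- c("val_1", "val_2", "key_10")

# fixed = TRUE treats "_" as a literal string rather than a regular expression
parts <- strsplit(x, "_", fixed = TRUE)

# The result is a list with one character vector per input element
print(parts)

# Pull out the first and second piece of every element
first  <- sapply(parts, `[`, 1)
second <- sapply(parts, `[`, 2)
print(first)
print(second)
```

Extracting pieces by position like this works as long as every string contains the separator at least once; otherwise the missing piece comes back as NA.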
The do.call() function constructs and executes a function call from a function (or its name) and a list of arguments to pass to it. Calling rbind() through do.call() stacks the vectors returned by strsplit() as the rows of a character matrix, so that each piece ends up in its own column.
Syntax:
do.call(what, args)
Parameter:
- what – The function to call, or a character string naming it.
- args – A list of arguments to pass to the function.
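To see what do.call() contributes on its own, the sketch below applies it to a hand-written list that stands in for the output of strsplit(). The list name pieces and its contents are assumptions made for this illustration.

```r
# A list of equal-length character vectors, like strsplit() would return
pieces <- list(c("val", "1"), c("val", "2"), c("val", "3"))

# do.call() passes each list element as a separate argument,
# so this is equivalent to rbind(pieces[[1]], pieces[[2]], pieces[[3]])
mat <- do.call(rbind, pieces)

# rows = number of strings, columns = pieces per string
print(dim(mat))
print(mat)
```

This approach assumes every string splits into the same number of pieces. If the vectors differ in length, rbind() recycles values from the shorter ones, which can silently produce misleading columns.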
Example: Split Dataframe variable into multiple columns
R
# creating a dataframe
data_frame <- data.frame(col1 = c("val_1", "val_2", "val_3", "val_4"))
print("Original DataFrame")
print(data_frame)

# splitting values in column
print("Modified DataFrame")

# splitting the values of col1 using underscore character
data.frame(do.call("rbind", strsplit(as.character(data_frame$col1), "_", fixed = TRUE)))
Output:
[1] "Original DataFrame"
   col1
1 val_1
2 val_2
3 val_3
4 val_4
[1] "Modified DataFrame"
   X1 X2
1 val  1
2 val  2
3 val  3
4 val  4
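The default column names X1 and X2 are rarely what is wanted, and the split pieces are all character strings. As a possible follow-up, the sketch below names the new columns, converts the numeric piece, and attaches both to the original dataframe. The names prefix and id are hypothetical and chosen only for this example.

```r
# Same dataframe as in the example above
data_frame <- data.frame(col1 = c("val_1", "val_2", "val_3", "val_4"))

# Split col1 and stack the pieces into a character matrix
parts <- do.call(rbind, strsplit(as.character(data_frame$col1), "_", fixed = TRUE))

# Turn the matrix into a dataframe and give the columns readable names
split_cols <- data.frame(parts, stringsAsFactors = FALSE)
names(split_cols) <- c("prefix", "id")   # hypothetical column names

# The pieces are strings; convert the numeric part explicitly
split_cols$id <- as.integer(split_cols$id)

# Attach the new columns next to the original one
result <- cbind(data_frame, split_cols)
print(result)
```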
Method 2: Using tidyr package
The tidyr package in R is used to tidy data, that is, to reshape it so that each variable has its own column and each observation has its own row. The package can be downloaded and installed using the following command:
install.packages("tidyr")
The separate() function in tidyr splits a character column of a dataframe into several columns. The length of the vector of new column names determines the number of pieces the column is split into.
Syntax:
separate(data, col, into, sep)
Parameter:
- data: The dataframe containing the column to be split.
- col: The column to be split.
- into: The names of the new columns to create.
- sep: Pattern to split up the column by.
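Passing the arguments by name makes the call easier to read. The sketch below also uses two optional arguments of separate(), remove and convert, to keep the original column and to convert the numeric piece. The column names prefix and id are hypothetical. Note that tidyr 1.3.0 and later document separate() as superseded in favour of separate_wider_delim(); this sketch stays with separate() to match the article.

```r
library(tidyr)

data_frame <- data.frame(col1 = c("val_1", "val_2", "val_3", "val_4"))

result <- separate(
  data_frame,
  col = col1,
  into = c("prefix", "id"),   # hypothetical column names
  sep = "_",
  remove = FALSE,             # keep the original col1 in the result
  convert = TRUE              # run type conversion on the new columns
)

print(result)
str(result)
```

With convert = TRUE the id column should come back as an integer column instead of character, which saves a separate as.integer() step.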
Example: Split dataframe variable into multiple columns
R
library("tidyr")

# creating a dataframe
data_frame <- data.frame(col1 = c("val_1", "val_2", "val_3", "val_4"))
print("Original DataFrame")
print(data_frame)

# splitting values in column
print("Modified DataFrame")
data_frame %>% separate(col1, c("col1", "col2"), "_")
Output:
[1] "Original DataFrame"
   col1
1 val_1
2 val_2
3 val_3
4 val_4
[1] "Modified DataFrame"
  col1 col2
1  val    1
2  val    2
3  val    3
4  val    4
Method 3: Using stringr package
The stringr package in R is used for string manipulation. The package can be downloaded and installed using the following command:
install.packages("stringr")
The str_split_fixed() function in the stringr package splits each string into a fixed number of pieces and returns a character matrix with one column per piece. Strings that yield fewer pieces are padded with empty strings. The pattern is usually supplied as a single string, that is, a character vector of length one.
Syntax:
str_split_fixed(string, pattern, n)
Parameter:
- string: The character vector to be split.
- pattern: Pattern to split up the string by.
- n: The number of pieces to split the string into.
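The fixed number of pieces matters most when the strings do not all contain the same number of separators. The sketch below uses made-up input values to show how extra separators and missing separators are handled, and then converts the matrix into a dataframe. The column names prefix and rest are hypothetical.

```r
library(stringr)

# Illustrative input: one, two, and zero separators
x <- c("val_1", "val_2_extra", "val")

# Split into exactly 2 pieces: the last piece keeps any remaining text,
# and strings with fewer pieces are padded with ""
m <- str_split_fixed(x, "_", 2)
print(m)

# Convert the character matrix to a dataframe with readable column names
split_df <- as.data.frame(m, stringsAsFactors = FALSE)
names(split_df) <- c("prefix", "rest")   # hypothetical column names
print(split_df)
```

Because the result always has n columns, it can be bound to the original dataframe without the recycling problems that uneven strsplit() output can cause.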
Example: Split dataframe variable into multiple columns
R
library("stringr")

# creating a dataframe
data_frame <- data.frame(col1 = c("val_1", "val_2", "val_3", "val_4"))
print("Original DataFrame")
print(data_frame)

# splitting values in column
print("Modified DataFrame")
str_split_fixed(data_frame$col1, "_", 2)
Output:
[1] "Original DataFrame"
   col1
1 val_1
2 val_2
3 val_3
4 val_4
[1] "Modified DataFrame"
     [,1]  [,2]
[1,] "val" "1"
[2,] "val" "2"
[3,] "val" "3"
[4,] "val" "4"