Split DataFrame Variable into Multiple Columns in R
In this article, we will discuss how to split dataframe variables into multiple columns using R programming language.
Method 1: Using strsplit() and do.call()
The strsplit() function in R splits each element of a character vector into pieces wherever the given pattern matches. It returns a list with one character vector of pieces per input element.
Syntax:
strsplit(str, pattern)
Parameters:
- str: The string vector to be split.
- pattern: Pattern to split up the string by.
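As a small sketch of what strsplit() returns (the sample vector x is made up for illustration), the result is a list, and the pattern is treated as a regular expression unless fixed = TRUE is set:

```r
# Hypothetical sample vector, not taken from the article's data
x <- c("val_1", "val_2")

# strsplit() returns a list: one character vector of pieces per input string
parts <- strsplit(x, "_", fixed = TRUE)
print(parts)

# Without fixed = TRUE the pattern is a regular expression, so "." would match
# any character; fixed = TRUE treats it as a literal dot
dotted <- strsplit("a.b", ".", fixed = TRUE)
print(dotted)
```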
The do.call() function calls a function, given either as a function object or by its name as a string, with the arguments supplied in a list. Passing rbind to do.call() stacks the vectors returned by strsplit() as the rows of a matrix, so each piece of the split string ends up in its own column.
Syntax:
do.call(what, args)
Parameters:
- what: The function to call, or its name as a string.
- args: A list of arguments to pass to the function.
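To illustrate how do.call() spreads a list over a function's arguments, here is a minimal sketch; the pieces list is a hypothetical stand-in for the output of strsplit():

```r
# Hypothetical list of equal-length character vectors
pieces <- list(c("val", "1"), c("val", "2"), c("val", "3"))

# do.call() passes each list element as a separate argument, so this is
# equivalent to rbind(pieces[[1]], pieces[[2]], pieces[[3]])
mat <- do.call(rbind, pieces)
print(dim(mat))
```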
Example: Split Dataframe variable into multiple columns
R
# Build a sample data frame with one string column
data_frame <- data.frame(
  col1 = c("val_1", "val_2", "val_3", "val_4")
)
print("Original DataFrame")
print(data_frame)

# Split col1 on "_" and bind the pieces row-wise into new columns
print("Modified DataFrame")
data.frame(do.call("rbind", strsplit(as.character(data_frame$col1), "_",
                                     fixed = TRUE)))
Output:
[1] "Original DataFrame"
col1
1 val_1
2 val_2
3 val_3
4 val_4
[1] "Modified DataFrame"
X1 X2
1 val 1
2 val 2
3 val 3
4 val 4
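The result above has the default names X1 and X2 and no longer contains the original column. One possible way to keep the original column and give the pieces meaningful names is sketched below; the names prefix and id are illustrative choices, not part of the article's example:

```r
data_frame <- data.frame(col1 = c("val_1", "val_2", "val_3", "val_4"))

# Split col1 and stack the pieces into a two-column character matrix
pieces <- do.call(rbind, strsplit(as.character(data_frame$col1), "_", fixed = TRUE))

# "prefix" and "id" are illustrative column names, not from the article
split_cols <- data.frame(prefix = pieces[, 1], id = as.integer(pieces[, 2]))

# Keep the original column next to the new ones
result <- cbind(data_frame, split_cols)
print(result)
```

This assumes every value splits into the same number of pieces; rbind() recycles shorter vectors rather than failing, so uneven splits should be checked beforehand.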
Method 2: Using the tidyr package
The tidyr package in R is used to tidy data, that is, to reshape it so that each variable has its own column and each observation its own row. The package can be installed using the following command:
install.packages("tidyr")
The separate() function in tidyr splits a single character column of a data frame into multiple columns. The length of the into vector determines the number of columns the values are split into.
Syntax:
separate(data, col, into, sep)
Parameters:
- data: The data frame containing the column.
- col: The column to be split.
- into: The names of the new columns.
- sep: The separator between the pieces, interpreted as a regular expression when given as a string.
Example: Split dataframe variable into multiple columns
R
library("tidyr")

# Build a sample data frame with one string column
data_frame <- data.frame(
  col1 = c("val_1", "val_2", "val_3", "val_4")
)
print("Original DataFrame")
print(data_frame)

# Split col1 on "_" into two new columns named col1 and col2
print("Modified DataFrame")
data_frame %>%
  separate(col1, c("col1", "col2"), "_")
Output:
[1] "Original DataFrame"
col1
1 val_1
2 val_2
3 val_3
4 val_4
[1] "Modified DataFrame"
col1 col2
1 val 1
2 val 2
3 val 3
4 val 4
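In the output above both new columns are character columns and the original column is replaced. As a hedged sketch, the remove and convert arguments of separate() can change that; the column names prefix and id are illustrative, and the tidyr package is assumed to be installed:

```r
library(tidyr)

data_frame <- data.frame(col1 = c("val_1", "val_2", "val_3", "val_4"))

# remove = FALSE keeps the original column; convert = TRUE lets tidyr
# guess column types, so the numeric piece is no longer a string.
# "prefix" and "id" are illustrative names, not from the article
result <- separate(data_frame, col1, into = c("prefix", "id"),
                   sep = "_", remove = FALSE, convert = TRUE)
str(result)
```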
Method 3: Using the stringr package
The stringr package in R provides functions for string manipulation. The package can be installed using the following command:
install.packages("stringr")
The str_split_fixed() function in the stringr package splits each string into a fixed number of pieces and returns a character matrix with one column per piece. The pattern is interpreted as a regular expression by default.
Syntax:
str_split_fixed(str, pattern, n)
Parameters:
- str: The string vector to be split.
- pattern: Pattern to split up the string by.
- n: The number of pieces to split the string into.
Example: Split dataframe variable into multiple columns
R
library("stringr")

# Build a sample data frame with one string column
data_frame <- data.frame(
  col1 = c("val_1", "val_2", "val_3", "val_4")
)
print("Original DataFrame")
print(data_frame)

# Split col1 on "_" into exactly 2 pieces; the result is a character matrix
print("Modified DataFrame")
str_split_fixed(data_frame$col1, "_", 2)
Output:
[1] "Original DataFrame"
col1
1 val_1
2 val_2
3 val_3
4 val_4
[1] "Modified DataFrame"
[,1] [,2]
[1,] "val" "1"
[2,] "val" "2"
[3,] "val" "3"
[4,] "val" "4"
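Because str_split_fixed() returns a matrix rather than a data frame, the pieces still have to be attached to the data frame. One way to do this is sketched below; the column names prefix and id are illustrative, and the stringr package is assumed to be installed:

```r
library(stringr)

data_frame <- data.frame(col1 = c("val_1", "val_2", "val_3", "val_4"))

# str_split_fixed() returns a character matrix with n columns
pieces <- str_split_fixed(data_frame$col1, "_", 2)

# Assign the matrix columns to new data frame columns
# ("prefix" and "id" are illustrative names, not from the article)
data_frame$prefix <- pieces[, 1]
data_frame$id <- pieces[, 2]
print(data_frame)
```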
Last Updated: 19 Nov, 2021