
Remove All Whitespace in Each DataFrame Column in R

Last Updated : 02 Jun, 2022

In this article, we will learn how to remove all whitespace from each column of a data frame in the R programming language.

Sample dataframe in use:

           c1     c2
1   geeks for geeks 
2          cs     f 
3  r   -lang       g

Method 1: Using gsub()

In this approach, the apply() function is called with margin 2, which applies a function to each column of the data frame. The function applied is gsub(), which replaces every match of a pattern in a string. Here the pattern \\s+ (one or more whitespace characters) is replaced by the empty string "", which removes all whitespace.

Note: The entire call is wrapped in as.data.frame() because apply() returns a matrix, which must be converted back into a data frame.

Syntax: as.data.frame(apply(df, margin, function(x) gsub("\\s+", "", x)))

Parameters:

df: data frame object

margin: dimension over which the operation is applied (1 = rows, 2 = columns)

function(x): operation to be applied, gsub() in this case

gsub(): replaces every match of "\\s+" with ""
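To make the margin argument concrete, here is a minimal sketch that runs the same gsub() call with margin 2 and margin 1 on the sample data frame. The names by_col and by_row are illustrative only. With margin 2 the result keeps the layout of the data frame, while with margin 1 apply() returns the result transposed.

```r
df <- data.frame(c1 = c("  geeks for", "  cs", "r   -lang "),
                 c2 = c("geeks ", "f ", "  g"))

# margin 2: the function receives one column at a time
by_col <- apply(df, 2, function(x) gsub("\\s+", "", x))

# margin 1: the function receives one row at a time,
# and apply() stacks the results as columns (transposed layout)
by_row <- apply(df, 1, function(x) gsub("\\s+", "", x))

dim(by_col)  # rows x columns of the original data frame
dim(by_row)  # columns x rows, i.e. transposed
```

For cleaning a data frame column by column, margin 2 is usually the more convenient choice because no transposition is needed afterwards.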

Example: R program to remove whitespaces using gsub()

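# Sample data frame with leading, trailing, and inner whitespace
df <- data.frame(c1 = c("  geeks for", "  cs", "r   -lang "),
                 c2 = c("geeks ", "f ", "  g"))

# Apply gsub() to each column (margin 2) and convert the matrix back
df_new <- as.data.frame(
  apply(df, 2, function(x) gsub("\\s+", "", x)))

df_new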


Output:

        c1    c2
1 geeksfor geeks
2       cs     f
3   r-lang     g
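One caveat worth knowing: apply() converts the data frame to a matrix first, so any numeric column would come back as character. A minimal sketch of an alternative, assuming a hypothetical data frame df_mixed with one numeric column (it is not part of the example above), loops over the columns with lapply() and changes only the character ones:

```r
# Hypothetical data frame with a numeric column (illustration only)
df_mixed <- data.frame(c1 = c("  geeks for", "  cs", "r   -lang "),
                       n  = c(1.5, 2, 3),
                       stringsAsFactors = FALSE)

# Remove whitespace from character columns only; other columns are untouched.
# Assigning into df_mixed[] keeps the object a data frame.
df_mixed[] <- lapply(df_mixed, function(x) {
  if (is.character(x)) gsub("\\s+", "", x) else x
})

str(df_mixed)
```

Because the result is assigned into df_mixed[], no as.data.frame() conversion is needed and the numeric column stays numeric.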

Method 2: Using str_remove_all()

First install the stringr package with install.packages("stringr"), then load it with library().

The str_remove_all() function takes two arguments: the string on which the removal is performed, and the pattern whose occurrences are all removed. The pattern is interpreted as a regular expression, so a single space " " matches only the space character, while "\\s" matches any whitespace character, including tabs and newlines.

Syntax: str_remove_all(string, pattern)

Parameters:

string: input string (or character vector)

pattern: pattern to be removed from the string

Example: R program to remove whitespaces using str_remove_all()

library("stringr")

# Remove every space from a single string
str <- " Welcome   to Geeks for Geeks "
str_remove_all(str, " ")


Output:

[1] "WelcometoGeeksforGeeks"
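The pattern " " removes the space character only. As a small sketch of the difference, assuming a hypothetical input string s that also contains a tab and a newline, the pattern "\\s" removes every kind of whitespace:

```r
library(stringr)

# Hypothetical input containing a space, a tab, and a newline
s <- "a b\tc\nd"

only_spaces <- str_remove_all(s, " ")    # removes the space character only
all_ws      <- str_remove_all(s, "\\s")  # removes spaces, tabs, and newlines

# The tab and newline survive the first call but not the second
print(only_spaces == "ab\tc\nd")
print(all_ws == "abcd")
```

If the data may contain tabs or line breaks, "\\s" is the safer pattern to pass.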

Now that we have seen how str_remove_all() works on a single string, let's apply it to every column of the data frame.

Syntax: as.data.frame(apply(df, margin, str_remove_all, " "))

Parameters:

df: data frame object

margin: dimension over which the operation is applied (1 = rows, 2 = columns)

str_remove_all: operation to be applied

In this approach, apply() with margin 2 applies str_remove_all() to each column of the data frame. The pattern " " is passed as an extra argument, so every space is removed from each value. Note that this pattern matches the space character only; use "\\s" to remove tabs and newlines as well.

Note: As in Method 1, the call is wrapped in as.data.frame() because apply() returns a matrix, which must be converted back into a data frame.

Example: R program to remove whitespaces from dataframe using str_remove_all()

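library("stringr")

# Sample data frame with leading, trailing, and inner whitespace
df <- data.frame(c1 = c("  geeks for", "  cs", "r   -lang "),
                 c2 = c("geeks ", "f ", "  g"))

# Apply str_remove_all() to each column (margin 2); " " is the pattern
df_new <- as.data.frame(apply(df, 2, str_remove_all, " "))

df_new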


Output:

        c1    c2
1 geeksfor geeks
2       cs     f
3   r-lang     g

Method 3: Using str_replace_all()

The str_replace_all() function takes three arguments: the input string on which the operation is performed, the pattern to be replaced, and the replacement value. Here the pattern " " is replaced by "".

Syntax: as.data.frame(apply(df, 2, function(x) str_replace_all(string = x, pattern = " ", replacement = "")))

Parameters:

df: data frame object

margin: dimension over which the operation is applied (1 = rows, 2 = columns)

function(x): operation to be applied, str_replace_all() in this case

str_replace_all(): replaces all occurrences of " " with ""

In this approach, apply() with margin 2 applies a function to each column of the data frame. That function calls str_replace_all(), which replaces every match of a pattern in a string. Here each space " " is replaced by "", which removes the spaces.

Note: As in Method 1, the call is wrapped in as.data.frame() because apply() returns a matrix, which must be converted back into a data frame.

Example: R program to remove whitespaces using str_replace_all()

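library(stringr)

# Sample data frame with leading, trailing, and inner whitespace
df <- data.frame(c1 = c("  geeks for", "  cs", "r   -lang "),
                 c2 = c("geeks ", "f ", "  g"))

# Replace every space with "" in each column (margin 2)
df_new <- as.data.frame(apply(df, 2,
                              function(x) str_replace_all(string = x,
                                                          pattern = " ",
                                                          replacement = "")))

df_new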


Output:

        c1    c2
1 geeksfor geeks
2       cs     f
3   r-lang     g
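As a quick sanity check, the three methods can be run side by side on the sample data and compared. This is only a sketch; the names m1, m2, m3, and same_result are illustrative and not part of the examples above.

```r
library(stringr)

df <- data.frame(c1 = c("  geeks for", "  cs", "r   -lang "),
                 c2 = c("geeks ", "f ", "  g"))

# Method 1: base R gsub() with the regular expression \\s+
m1 <- as.data.frame(apply(df, 2, function(x) gsub("\\s+", "", x)))

# Method 2: stringr::str_remove_all() with a literal space
m2 <- as.data.frame(apply(df, 2, str_remove_all, " "))

# Method 3: stringr::str_replace_all() with a literal space
m3 <- as.data.frame(apply(df, 2, function(x)
  str_replace_all(string = x, pattern = " ", replacement = "")))

# Do all three approaches agree on this data?
same_result <- identical(m1, m2) && identical(m2, m3)
print(same_result)
```

The methods agree here because the sample data contains only space characters. With tabs or newlines in the data, Methods 2 and 3 would need the pattern "\\s" to match the behaviour of Method 1.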


