Drop multiple columns using the dplyr package in R

Last Updated : 21 Jul, 2021

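In this article, we will discuss how to drop multiple columns from a data frame using the dplyr package in the R programming language.

Dataset in use: a data frame named data1 with three columns (id, name and address) and ten rows, created at the top of each example.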

Drop multiple columns by using the column name

We can remove columns with the select() function by passing their names with a minus sign in front.

Syntax:

select(dataframe, -c(column_name1, column_name2, ..., column_name_n))

Here, dataframe is the input data frame and -c(column_names) lists the names of the columns to be removed.

Example: R program to remove multiple columns by column name

R

# load the library 
library(dplyr) 
  
# create dataframe with 3 columns id, 
# name and address 
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2), 
                   
                 name=c('sravan','ojaswi','bobby', 
                        'gnanesh','rohith','pinkey', 
                        'dhanush','sravan','gnanesh', 
                        'ojaswi'), 
                   
                 address=c('hyd','hyd','ponnur','tenali', 
                           'vijayawada','vijayawada','guntur', 
                           'hyd','tenali','hyd')) 
  
# remove the id and name columns
print(select(data1, -c(id, name)))

# remove the address and name columns
print(select(data1, -c(address, name)))

# remove all columns (leaves a data frame with 0 columns)
print(select(data1, -c(address, name, id)))

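Output: the first call prints only the address column, the second prints only the id column, and the third prints an empty data frame with 0 columns and 10 rows.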
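When the column names are stored in a character vector rather than typed directly, the same idea can be sketched with the all_of() and any_of() helpers. This is a minimal sketch, assuming dplyr 1.0.0 or later; cols_to_drop and maybe_missing are hypothetical variable names used only for illustration.

```r
library(dplyr)

data1 <- data.frame(
  id      = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name    = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
              'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# hypothetical vector holding the names of the columns to drop
cols_to_drop <- c('id', 'name')

# all_of() treats the vector's contents as column names and
# raises an error if any of them is missing from the data frame
remaining <- select(data1, -all_of(cols_to_drop))
print(names(remaining))

# any_of() silently ignores names that do not exist ('phone' here)
maybe_missing <- c('name', 'phone')
lenient <- select(data1, -any_of(maybe_missing))
print(names(lenient))
```

all_of() is the stricter choice when a missing column should be treated as a mistake; any_of() suits scripts that run against data frames whose columns vary.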

Drop multiple columns by using column index

We can also remove columns with the select() function by their index (position). Indexing starts at 1.

Syntax:

select(dataframe, -c(column_index1, column_index2, ..., column_index_n))

Here, dataframe is the input data frame and c(column_indexes) gives the positions of the columns to be removed.

Example: R program to remove multiple columns by position

R

# load the library 
library(dplyr) 
  
# create dataframe with 3 columns 
# id,name and address 
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2), 
                   
                 name=c('sravan','ojaswi','bobby', 
                        'gnanesh','rohith','pinkey', 
                        'dhanush','sravan','gnanesh', 
                        'ojaswi'), 
                   
                 address=c('hyd','hyd','ponnur','tenali', 
                           'vijayawada','vijayawada','guntur', 
                           'hyd','tenali','hyd')) 
  
# remove the id and name columns by
# their positions (1 and 2)
print(select(data1, -c(1, 2)))

Output: only the address column is printed.
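Positions depend on the current column order, so the same indexes can drop different columns after a reorder. The sketch below illustrates this and also shows a range of positions; the variable names drop_range, reordered and after_reorder are hypothetical.

```r
library(dplyr)

data1 <- data.frame(
  id      = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name    = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
              'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# a range of positions drops the same columns as -c(1, 2)
drop_range <- select(data1, -(1:2))
print(names(drop_range))

# after reordering, positions 1 and 2 refer to address and id,
# so the same call now leaves the name column instead
reordered <- select(data1, address, id, name)
after_reorder <- select(reordered, -c(1, 2))
print(names(after_reorder))
```

For this reason, dropping by name is usually the safer option when the column order is not guaranteed.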

Drop columns whose names contain a substring or match a pattern

Let’s see how to remove columns whose names contain a given character or string.

Method 1: Using contains()

contains() selects the columns whose names contain the given literal substring, so -contains() removes those columns. By default the match ignores case.

Syntax:

select(dataframe, -contains('sub_string'))

Here, dataframe is the input data frame and sub_string is the text that must appear in the name of each column to be removed.

Method 2: Using matches()

matches() selects the columns whose names match the given regular expression, so -matches() removes those columns. Unlike contains(), the argument is interpreted as a pattern rather than as literal text.

Syntax:

select(dataframe, -matches('pattern'))

Here, dataframe is the input data frame and pattern is the regular expression that the names of the columns to be removed must match.

Example: R program that removes column using contains() method

R

# load the library 
library(dplyr) 
  
# create dataframe with 3 columns  
# id,name and address 
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2), 
                   
                 name=c('sravan','ojaswi','bobby', 
                        'gnanesh','rohith','pinkey', 
                        'dhanush','sravan','gnanesh', 
                        'ojaswi'), 
                   
                 address=c('hyd','hyd','ponnur','tenali', 
                           'vijayawada','vijayawada','guntur', 
                           'hyd','tenali','hyd')) 
  
  
# remove columns whose names contain 'na' (drops name)
print(select(data1, -contains('na')))

# remove columns whose names contain 're' (drops address)
print(select(data1, -contains('re')))

Output: the first call prints the id and address columns; the second prints the id and name columns.
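The example above covers only contains(). A comparable sketch for matches() is given below to show how a regular expression differs from a literal substring; the patterns and the result names (no_dd, only_id, literal) are chosen purely for illustration.

```r
library(dplyr)

data1 <- data.frame(
  id      = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name    = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
              'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# 'd{2}' is a regular expression meaning "two d's in a row",
# which only the name 'address' satisfies
no_dd <- select(data1, -matches('d{2}'))
print(names(no_dd))

# '^(n|a)' matches names that start with n or with a
only_id <- select(data1, -matches('^(n|a)'))
print(names(only_id))

# contains() treats '^n' as literal text, so no column name
# matches and nothing is dropped
literal <- select(data1, -contains('^n'))
print(names(literal))
```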

Remove columns whose names start or end with certain characters

We can also drop columns based on the characters their names start or end with.

starts_with() selects the columns whose names start with the given prefix, so -starts_with() removes those columns.

Syntax:

select(dataframe, -starts_with('substring'))

Here, dataframe is the input data frame and substring is the prefix that the names of the columns to be removed start with.

ends_with() selects the columns whose names end with the given suffix, so -ends_with() removes those columns.

Syntax:

select(dataframe, -ends_with('substring'))

Here, dataframe is the input data frame and substring is the suffix that the names of the columns to be removed end with.

Example 1: R program to remove a column that starts with character/substring

R

# load the library 
library(dplyr) 
  
# create dataframe with 3 columns  
# id,name and address 
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2), 
                   
                 name=c('sravan','ojaswi','bobby', 
                        'gnanesh','rohith','pinkey', 
                        'dhanush','sravan','gnanesh', 
                        'ojaswi'), 
                   
                 address=c('hyd','hyd','ponnur','tenali', 
                           'vijayawada','vijayawada','guntur', 
                           'hyd','tenali','hyd')) 
  
  
# remove columns whose names start with 'na' (drops name)
print(select(data1, -starts_with('na')))

# remove columns whose names start with 'ad' (drops address)
print(select(data1, -starts_with('ad')))

Output: the first call prints the id and address columns; the second prints the id and name columns.

Example 2: R program to remove column that ends with character/substring

R

# load the library 
library(dplyr) 
  
# create dataframe with 3 columns  
# id,name and address 
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2), 
                   
                 name=c('sravan','ojaswi','bobby', 
                        'gnanesh','rohith','pinkey', 
                        'dhanush','sravan','gnanesh', 
                        'ojaswi'), 
                   
                 address=c('hyd','hyd','ponnur','tenali', 
                           'vijayawada','vijayawada','guntur', 
                           'hyd','tenali','hyd')) 
  
  
# remove columns whose names end with 'd' (drops id)
print(select(data1, -ends_with('d')))

# remove columns whose names end with 'ss' (drops address)
print(select(data1, -ends_with('ss')))

Output: the first call prints the name and address columns; the second prints the id and name columns.
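To drop several columns at once, starts_with() and ends_with() can take a vector of prefixes or suffixes, and the two helpers can be combined inside c(). The following is a small sketch along those lines; the result names (only_id, only_name, case_kept) are hypothetical.

```r
library(dplyr)

data1 <- data.frame(
  id      = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name    = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
              'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# a vector of prefixes drops every column that matches any of them
only_id <- select(data1, -starts_with(c('na', 'ad')))
print(names(only_id))

# combine both helpers: drop names starting with 'i' or ending in 'ss'
only_name <- select(data1, -c(starts_with('i'), ends_with('ss')))
print(names(only_name))

# matching ignores case by default; with ignore.case = FALSE the
# prefix 'NA' matches no column, so nothing is dropped
case_kept <- select(data1, -starts_with('NA', ignore.case = FALSE))
print(names(case_kept))
```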

Drop columns by name with a regular expression

Here we drop columns based on a pattern passed to the base R grepl() function. grepl() returns TRUE for each column name that matches the pattern, and negating the result with ! keeps only the columns that do not match.

Syntax:

dataframe[, !grepl("pattern", names(dataframe))]

Here, dataframe is the input data frame and pattern is the regular expression that identifies the columns to remove.

To remove the columns whose names start with a given letter, anchor the pattern with ^:

Syntax:

data[, !grepl("^letter", names(data))]

Example: R program to remove column that starts with a letter

R

# load the library 
library(dplyr) 
  
# create dataframe with 3 columns  
# id,name and address 
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2), 
                   
                 name=c('sravan','ojaswi','bobby', 
                        'gnanesh','rohith','pinkey', 
                        'dhanush','sravan','gnanesh', 
                        'ojaswi'), 
                   
                 address=c('hyd','hyd','ponnur','tenali', 
                           'vijayawada','vijayawada','guntur', 
                           'hyd','tenali','hyd')) 
  
  
# remove columns whose names start with n (drops name)
print(data1[, !grepl("^n", names(data1))])

# remove columns whose names start with a (drops address)
print(data1[, !grepl("^a", names(data1))])
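One detail worth keeping in mind with the base R approach: when only a single column is left, the [ , ] subsetting returns a plain vector unless drop = FALSE is supplied. The sketch below illustrates the difference; the variable names keep, only_id and as_vector are hypothetical.

```r
data1 <- data.frame(
  id      = c(1, 2, 3, 4, 5, 6, 7, 1, 4, 2),
  name    = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
              'pinkey', 'dhanush', 'sravan', 'gnanesh', 'ojaswi'),
  address = c('hyd', 'hyd', 'ponnur', 'tenali', 'vijayawada',
              'vijayawada', 'guntur', 'hyd', 'tenali', 'hyd')
)

# TRUE for the columns to keep: names that do not start with n or a
keep <- !grepl("^(n|a)", names(data1))

# drop = FALSE keeps the result a data frame even with one column
only_id <- data1[, keep, drop = FALSE]
print(is.data.frame(only_id))

# without drop = FALSE the single remaining column becomes a vector
as_vector <- data1[, keep]
print(is.data.frame(as_vector))
```

select() from dplyr does not have this behaviour: it returns a data frame regardless of how many columns remain.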