Drop multiple columns using Dplyr package in R

  • Last Updated : 21 Jul, 2021

In this article, we will discuss how to drop multiple columns from a dataframe using the dplyr package in the R programming language.

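Dataset in use: a dataframe data1 with ten rows and three columns (id, name and address), created at the start of each example.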

Drop multiple columns by using the column name

We can remove one or more columns with the select() function by putting a minus sign in front of their names.

Syntax:



select(dataframe,-c(column_name1,column_name2,...,column_name_n))

Where dataframe is the input dataframe and -c(column_names) is the collection of column names to be removed.
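As a minimal sketch of this syntax, the snippet below drops two columns by bare name and then does the same from a character vector of names. The dataframe df and its values are hypothetical, and the all_of() helper is assumed to be available (it is re-exported by dplyr 1.0.0 and later).

```r
library(dplyr)

# small illustrative dataframe (hypothetical values)
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 address = c("x", "y", "z"))

# drop two columns by bare name
by_name <- select(df, -c(id, name))

# drop the columns listed in a character vector
drop_cols <- c("id", "name")
by_vector <- select(df, -all_of(drop_cols))

print(names(by_name))
print(names(by_vector))
```

The character-vector form can be convenient when the names to drop are computed elsewhere in a script.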

Example: R program to remove multiple columns by column name

R




# load the library
library(dplyr)
  
# create dataframe with 3 columns id,
# name and address
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2),
                   
                 name=c('sravan','ojaswi','bobby',
                        'gnanesh','rohith','pinkey',
                        'dhanush','sravan','gnanesh',
                        'ojaswi'),
                   
                 address=c('hyd','hyd','ponnur','tenali',
                           'vijayawada','vijayawada','guntur',
                           'hyd','tenali','hyd'))
  
# remove the id and name columns
print(select(data1,-c(id,name)))
  
# remove the address and name columns
print(select(data1,-c(address,name)))
  
# remove all columns (leaves a dataframe with 10 rows and 0 columns)
print(select(data1,-c(address,name,id)))

Output:

Drop multiple columns by using column index

We can also remove columns with select() by their index (position) in the dataframe. Indexing starts at 1.

Syntax:



select(dataframe,-c(column_index1,column_index2,...,column_index_n))

Where dataframe is the input dataframe and -c(column_indexes) gives the positions of the columns to be removed.
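A short sketch of positional removal, using a hypothetical three-column dataframe df, shows that the positions do not have to be adjacent:

```r
library(dplyr)

# small illustrative dataframe (hypothetical values)
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 address = c("x", "y", "z"))

# drop the first and third columns by position
by_pos <- select(df, -c(1, 3))

print(names(by_pos))
```

Positions refer to the current column order, so dropping by index can silently remove the wrong columns if the dataframe is reordered upstream.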

Example: R program to remove multiple columns by position

R




# load the library
library(dplyr)
  
# create dataframe with 3 columns
# id,name and address
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2),
                   
                 name=c('sravan','ojaswi','bobby',
                        'gnanesh','rohith','pinkey',
                        'dhanush','sravan','gnanesh',
                        'ojaswi'),
                   
                 address=c('hyd','hyd','ponnur','tenali',
                           'vijayawada','vijayawada','guntur',
                           'hyd','tenali','hyd'))
  
# remove the id and name columns by 
# their positions
print(select(data1,-c(1,2)))

Output:

Drop column which contains a value or matches a pattern

Let's see how to remove the columns whose names contain a given substring or match a pattern.

Method 1: Using contains()

contains() selects the columns whose names contain the given literal substring, and -contains() removes those columns. By default the match ignores case.

Syntax:

select(dataframe,-contains('sub_string'))



Here, dataframe is the input dataframe and sub_string is the text to look for in the column names; every column whose name contains it is removed.

Method 2: Using matches()

matches() selects the columns whose names match the given regular expression, and -matches() removes those columns. Unlike contains(), the argument is treated as a pattern rather than a literal substring.

Syntax:

select(dataframe,-matches('pattern'))

Here, dataframe is the input dataframe and pattern is the regular expression to test against the column names; every column whose name matches it is removed.
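To illustrate how matches() differs from contains(), here is a small sketch on a hypothetical dataframe df. The pattern is an illustrative choice that drops the columns named exactly id or name.

```r
library(dplyr)

# small illustrative dataframe (hypothetical values)
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 address = c("x", "y", "z"))

# drop the columns whose full name is "id" or "name" (regular expression)
by_regex <- select(df, -matches("^(id|name)$"))

# drop the columns whose name contains the literal text "dd"
by_substring <- select(df, -contains("dd"))

print(names(by_regex))
print(names(by_substring))
```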

Example: R program that removes columns using the contains() method

R




# load the library
library(dplyr)
  
# create dataframe with 3 columns 
# id,name and address
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2),
                   
                 name=c('sravan','ojaswi','bobby',
                        'gnanesh','rohith','pinkey',
                        'dhanush','sravan','gnanesh',
                        'ojaswi'),
                   
                 address=c('hyd','hyd','ponnur','tenali',
                           'vijayawada','vijayawada','guntur',
                           'hyd','tenali','hyd'))
  
  
# remove the columns whose names contain 'na' (name)
print(select(data1,-contains('na')))
      
# remove the columns whose names contain 're' (address)
print(select(data1,-contains('re')))

Output:

Remove column which starts with or ends with certain character

Columns can also be selected by how their names begin or end.



  • starts_with() selects the columns whose names start with the given prefix, and -starts_with() removes them.

Syntax:

select(dataframe,-starts_with('substring'))

Where dataframe is the input dataframe and substring is the prefix to look for at the start of each column name.

  • ends_with() selects the columns whose names end with the given suffix, and -ends_with() removes them.

Syntax:

select(dataframe,-ends_with('substring'))

Where dataframe is the input dataframe and substring is the suffix to look for at the end of each column name.
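Both helpers can be used to drop several groups of columns at once. The sketch below assumes a recent dplyr (1.0.0 or later), where starts_with() accepts a character vector of prefixes and takes the union of the matches; the dataframe df is hypothetical.

```r
library(dplyr)

# small illustrative dataframe (hypothetical values)
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 address = c("x", "y", "z"))

# drop every column whose name starts with "na" or "ad"
by_prefix <- select(df, -starts_with(c("na", "ad")))

# drop every column whose name ends with "ss"
by_suffix <- select(df, -ends_with("ss"))

print(names(by_prefix))
print(names(by_suffix))
```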

Example 1: R program to remove a column that starts with character/substring

R




# load the library
library(dplyr)
  
# create dataframe with 3 columns 
# id,name and address
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2),
                   
                 name=c('sravan','ojaswi','bobby',
                        'gnanesh','rohith','pinkey',
                        'dhanush','sravan','gnanesh',
                        'ojaswi'),
                   
                 address=c('hyd','hyd','ponnur','tenali',
                           'vijayawada','vijayawada','guntur',
                           'hyd','tenali','hyd'))
  
  
# remove the columns whose names start with 'na' (name)
print(select(data1,-starts_with('na')))
      
# remove the columns whose names start with 'ad' (address)
print(select(data1,-starts_with('ad')))

Output:



Example 2: R program to remove column that ends with character/substring

R




# load the library
library(dplyr)
  
# create dataframe with 3 columns 
# id,name and address
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2),
                   
                 name=c('sravan','ojaswi','bobby',
                        'gnanesh','rohith','pinkey',
                        'dhanush','sravan','gnanesh',
                        'ojaswi'),
                   
                 address=c('hyd','hyd','ponnur','tenali',
                           'vijayawada','vijayawada','guntur',
                           'hyd','tenali','hyd'))
  
  
# remove the columns whose names end with 'd' (id)
print(select(data1,-ends_with('d')))
      
# remove the columns whose names end with 'ss' (address)
print(select(data1,-ends_with('ss')))

Output:

Drop column name with Regular Expression

Here we drop columns whose names match a pattern using base R's grepl() function rather than a dplyr helper. grepl() returns TRUE for each column name that matches the pattern, and negating the result with ! keeps only the columns that do not match.

Syntax:

dataframe[,!grepl("pattern",names(dataframe))]

Here, dataframe is the input dataframe and pattern is a regular expression; columns whose names match it are removed.

To remove the columns whose names start with a given letter, anchor the pattern with ^:

Syntax:

data[,!grepl("^letter",names(data))]
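One detail worth knowing with this base R form: if only one column survives, the result is simplified to a plain vector unless drop = FALSE is passed. A minimal sketch on a hypothetical dataframe df:

```r
# small illustrative dataframe (hypothetical values)
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 address = c("x", "y", "z"))

# TRUE for the columns to keep: names that do not start with "n" or "a"
keep <- !grepl("^(n|a)", names(df))

# without drop = FALSE the single remaining column becomes a vector
as_vector <- df[, keep]

# with drop = FALSE the result stays a dataframe
as_frame <- df[, keep, drop = FALSE]

print(is.data.frame(as_vector))
print(is.data.frame(as_frame))
```

The dplyr select() helpers shown earlier always return a dataframe, so this concern applies only to the bracket-indexing form.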

Example: R program to remove column that starts with a letter

R




# load the library
library(dplyr)
  
# create dataframe with 3 columns 
# id,name and address
data1=data.frame(id=c(1,2,3,4,5,6,7,1,4,2),
                   
                 name=c('sravan','ojaswi','bobby',
                        'gnanesh','rohith','pinkey',
                        'dhanush','sravan','gnanesh',
                        'ojaswi'),
                   
                 address=c('hyd','hyd','ponnur','tenali',
                           'vijayawada','vijayawada','guntur',
                           'hyd','tenali','hyd'))
  
  
# drop the columns whose names start with n (name)
print(data1[,!grepl("^n",names(data1))])
      
# drop the columns whose names start with a (address)
print(data1[,!grepl("^a",names(data1))])

Output:



