How to Remove Duplicate Rows in R DataFrame?
Last Updated: 15 Feb, 2022
In this article, we will discuss how to remove duplicate rows from a dataframe in the R programming language.
Dataset in use:
Method 1: Using distinct()
The distinct() function is available in the dplyr package and returns the unique rows of a dataframe. We can remove rows that are duplicated across the entire dataframe, and we can also get the distinct values of one or more particular columns.
Syntax:
distinct(dataframe)
distinct(dataframe, column1, column2, ..., columnN)
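When columns are passed to distinct(), only those columns appear in the result. If the remaining columns are needed as well, dplyr's `.keep_all = TRUE` argument keeps them, taking the first row found for each distinct value. The following is a minimal sketch; the data frame `df` and the result names are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical sample data, mirroring the example used in this article
df <- data.frame(
  names = c("manoj", "bobby", "sravan", "deepu", "manoj", "bobby"),
  id = c(1, 2, 3, 4, 1, 2),
  subjects = c("java", "python", "php", "html", "java", "python")
)

# Only the listed column is returned
only_names <- distinct(df, names)

# .keep_all = TRUE keeps every column, using the first row for each distinct name
full_rows <- distinct(df, names, .keep_all = TRUE)

print(only_names)
print(full_rows)
```

Here `only_names` has a single column, while `full_rows` keeps all three columns of the first row seen for each name.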
Example: R program to remove duplicate rows using distinct() function
R
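library(dplyr)

# Sample dataframe in which the last two rows repeat the first two
data <- data.frame(
  names = c("manoj", "bobby", "sravan", "deepu", "manoj", "bobby"),
  id = c(1, 2, 3, 4, 1, 2),
  subjects = c("java", "python", "php", "html", "java", "python")
)

# Remove rows that are duplicated across all columns
print(distinct(data))

# Distinct values of a single column (only that column is returned)
print(distinct(data, subjects))
print(distinct(data, names))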
Output:
Method 2: Using duplicated()
The duplicated() function returns a logical vector that is TRUE for every element (or row) that repeats an earlier one. To get the unique rows, we negate this vector with the ! operator and use it to subset the dataframe.
Syntax:
data[!duplicated(data$column_name), ]
where,
- data is the input dataframe
- column_name is the column in which duplicates are checked
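To make the role of the ! operator concrete, the sketch below shows the logical vector that duplicated() produces on a plain character vector. The variable names are assumptions for illustration. It also uses base R's `fromLast = TRUE` argument, which keeps the last occurrence of each value instead of the first.

```r
# Hypothetical input vector with two repeated values
values <- c("java", "python", "php", "html", "java", "python")

# TRUE marks an element that repeats an earlier one
dup_flags <- duplicated(values)
print(dup_flags)

# Negating the flags keeps the first occurrence of each value
first_kept <- values[!dup_flags]
print(first_kept)

# fromLast = TRUE scans from the end, so the last occurrence is kept instead
last_kept <- values[!duplicated(values, fromLast = TRUE)]
print(last_kept)
```

The same pattern carries over to a dataframe: the logical vector selects rows when it is placed before the comma in `data[ , ]`.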
Example: R program to remove duplicate rows using duplicated() function
R
# Sample dataframe in which the last two rows repeat the first two
data <- data.frame(
  names = c("manoj", "bobby", "sravan", "deepu", "manoj", "bobby"),
  id = c(1, 2, 3, 4, 1, 2),
  subjects = c("java", "python", "php", "html", "java", "python")
)

# Keep the first row for each distinct value of the given column
print(data[!duplicated(data$subjects), ])
print(data[!duplicated(data$names), ])
print(data[!duplicated(data$id), ])
Output:
Method 3: Using unique()
This will get the unique rows from the dataframe.
Syntax:
unique(dataframe)
To get the unique values of a particular column:
Syntax:
unique(dataframe$column_name)
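The two forms return different kinds of objects, which the following sketch illustrates under assumed names (`df`, `unique_rows`, `unique_names`). Applied to a whole dataframe, unique() returns a dataframe with duplicate rows dropped. Applied to a single column, it returns a plain vector of values rather than rows.

```r
# Hypothetical sample data, mirroring the example used in this article
df <- data.frame(
  names = c("manoj", "bobby", "sravan", "deepu", "manoj", "bobby"),
  id = c(1, 2, 3, 4, 1, 2),
  subjects = c("java", "python", "php", "html", "java", "python")
)

# Whole dataframe: result is still a dataframe, with duplicate rows removed
unique_rows <- unique(df)

# Single column: result is a vector of the distinct values
unique_names <- unique(df$names)

# For a dataframe, unique() gives the same rows as subsetting with !duplicated()
same_result <- identical(unique_rows, df[!duplicated(df), ])

print(unique_rows)
print(unique_names)
print(same_result)
```

If the other columns need to be kept while deduplicating on one column, the `data[!duplicated(data$column_name), ]` form from Method 2 is the better fit.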
Example: R program to get the unique values of each column using unique() function
R
# Sample dataframe in which the last two rows repeat the first two
data <- data.frame(
  names = c("manoj", "bobby", "sravan", "deepu", "manoj", "bobby"),
  id = c(1, 2, 3, 4, 1, 2),
  subjects = c("java", "python", "php", "html", "java", "python")
)

# Unique values of each column, returned as vectors
print(unique(data$subjects))
print(unique(data$names))
print(unique(data$id))
Output:
[1] "java" "python" "php" "html"
[1] "manoj" "bobby" "sravan" "deepu"
[1] 1 2 3 4
Example: R program to apply unique() function to the entire dataframe
R
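# Sample dataframe in which the last two rows repeat the first two
data <- data.frame(
  names = c("manoj", "bobby", "sravan", "deepu", "manoj", "bobby"),
  id = c(1, 2, 3, 4, 1, 2),
  subjects = c("java", "python", "php", "html", "java", "python")
)

# Remove rows that are duplicated across all columns
print(unique(data))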
Output: