Count the number of duplicates in R
In this article, we will see how to count the number of duplicates in the R programming language.
It can be done with two methods:
- Using duplicated() function.
- Using an algorithm.
Method 1: Using duplicated()
Here we will use R's duplicated() function together with dplyr functions.
- Load the tidyverse package with library(tidyverse).
- Create a data frame or a vector.
- Use the duplicated() function and check for the duplicate data.
Syntax: duplicated(x)
Parameters: x: a data frame or a vector
Example 1: Finding duplicates in a vector.
Let’s first create a vector and find the position of the duplicate elements in x.
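A minimal sketch of this step. The values of x are assumed for illustration, since the original vector is not shown here:

```r
# Illustrative vector; these values are an assumption for this example
x <- c(1, 2, 3, 2, 5, 1, 7, 2)

# duplicated() returns TRUE for every element that repeats an earlier one
duplicated(x)

# which() turns the logical vector into the positions of the duplicates
dup_positions <- which(duplicated(x))
dup_positions
```

Note that duplicated() flags only the second and later occurrences of a value, so the first occurrence is never marked as a duplicate.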
Extract the duplicate elements in x.
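One way to do this is to index x with the logical vector that duplicated() returns. The vector values below are again assumed for illustration:

```r
# Illustrative vector; these values are an assumption for this example
x <- c(1, 2, 3, 2, 5, 1, 7, 2)

# Keep only the elements flagged as duplicates
dup_values <- x[duplicated(x)]
dup_values

# The number of duplicates is the number of TRUE values
dup_count <- sum(duplicated(x))
dup_count
```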
Here we can see all the elements which are duplicated.
Example 2: Finding duplicates in a data frame.
Let’s now create a data frame.
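A possible data frame for this example is sketched below. Only the emp_id column is named in the text; the name and dept columns and all values are hypothetical:

```r
# Illustrative data; only emp_id comes from the article,
# the other columns and all values are assumed
df <- data.frame(
  emp_id = c(1, 2, 3, 2, 4, 1, 5),
  name   = c("Ann", "Bob", "Cid", "Dan", "Dee", "Eli", "Eve"),
  dept   = c("HR", "IT", "IT", "IT", "Ops", "HR", "Ops")
)
df
```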
Here we have a data frame and some items are duplicated, so we have to find the duplicated elements in this data frame.
We will check which column has the duplicated data.
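One way to check each column, sketched here in base R, is to apply duplicated() column by column with sapply(). The data frame is the same hypothetical one used for illustration (only emp_id is named in the text):

```r
# Illustrative data; columns other than emp_id and all values are assumed
df <- data.frame(
  emp_id = c(1, 2, 3, 2, 4, 1, 5),
  name   = c("Ann", "Bob", "Cid", "Dan", "Dee", "Eli", "Eve"),
  dept   = c("HR", "IT", "IT", "IT", "Ops", "HR", "Ops")
)

# Count the repeated values in every column
dup_per_column <- sapply(df, function(col) sum(duplicated(col)))
dup_per_column
```

A count of 0 means the column contains no repeated values.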
Now count how many duplicated elements the emp_id column contains.
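A base R sketch of this count, using the same hypothetical data frame (columns other than emp_id are assumed):

```r
# Illustrative data; columns other than emp_id and all values are assumed
df <- data.frame(
  emp_id = c(1, 2, 3, 2, 4, 1, 5),
  name   = c("Ann", "Bob", "Cid", "Dan", "Dee", "Eli", "Eve"),
  dept   = c("HR", "IT", "IT", "IT", "Ops", "HR", "Ops")
)

# Number of emp_id values that repeat an earlier one
n_dup_emp_id <- sum(duplicated(df$emp_id))
n_dup_emp_id

# Rows whose emp_id has already appeared
df[duplicated(df$emp_id), ]
```

With dplyr loaded, the row selection could equally be written as `df %>% filter(duplicated(emp_id))`.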
We can see all the duplicated elements in column emp_id.
Method 2: Using an algorithm.
Let us assume we have a data frame with duplicate data, and we have to find out the number of duplicates in that data frame.
This gives us the number of duplicates in the data frame.
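The article's own algorithm is not shown here, so the following is only one possible sketch: build a key for each row, count how often each key occurs, and treat every occurrence beyond the first as a duplicate. The data frame, its columns other than emp_id, and the "|" separator are assumptions for illustration:

```r
# Illustrative data with repeated rows; all values are assumed
df <- data.frame(
  emp_id = c(1, 2, 3, 2, 4, 1, 2),
  dept   = c("HR", "IT", "IT", "IT", "Ops", "HR", "IT")
)

# Step 1: build one text key per row by pasting its columns together
keys <- do.call(paste, c(df, sep = "|"))

# Step 2: count how many times each key occurs
counts <- table(keys)

# Step 3: every occurrence after the first is a duplicate
n_duplicates <- sum(counts - 1)
n_duplicates

# Cross-check against the built-in row-wise duplicated()
sum(duplicated(df))
```

The separator must not appear in the data itself, otherwise two different rows could produce the same key.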