
How to find duplicate values in a list in R

Last Updated : 12 Apr, 2024

In this article, we will see how to find duplicate values in a list in the R Programming Language in different scenarios.

Finding duplicate values in a List

In R, the duplicated() function finds duplicate values in R objects such as vectors and lists. Given a list, it returns a logical vector (TRUE/FALSE values) with one entry per element. An entry is TRUE if the same value already appeared earlier in the list, and FALSE otherwise, so the first occurrence of a value is never marked as a duplicate.

Syntax:

duplicated(List_name)

Here, List_name is the input list.
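The following is a minimal sketch of how the returned logical vector lines up with the input. The sample list x is hypothetical and not part of the article's data. It also shows the fromLast argument of duplicated(), which scans from the end so that earlier copies are flagged instead of later ones.

```r
# Hypothetical sample list, used only for illustration
x <- list("a", "b", "a", "c", "b")

# Default: the first occurrence of a value is FALSE, later repeats are TRUE
later_repeats <- duplicated(x)
print(later_repeats)

# fromLast = TRUE scans from the end, so the earlier copies are flagged instead
earlier_copies <- duplicated(x, fromLast = TRUE)
print(earlier_copies)

# Combining both marks every element whose value occurs more than once
all_copies <- later_repeats | earlier_copies
print(all_copies)
```

Combining the two scans is one possible way to mark every copy of a repeated value, not only the later ones.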

Let’s create a list with 10 values and find the duplicates.

R
# Create a list
List_data <- list(1, 2, 3, 4, 5, 6, 7, 5, 4, 3)
print(List_data)

# Find duplicates in the above list
print(duplicated(List_data))

Output:

[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

[[4]]
[1] 4

[[5]]
[1] 5

[[6]]
[1] 6

[[7]]
[1] 7

[[8]]
[1] 5

[[9]]
[1] 4

[[10]]
[1] 3

[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE

We can see that the last three elements of the list (5, 4 and 3) repeat values that appeared earlier, so TRUE is returned for them.
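The logical vector can also be used as an index to pull out the repeated elements themselves. The sketch below reuses the same 10-value list. The variable names dup_flags, dup_values, dup_positions and unique_values are chosen here for illustration only.

```r
# Same list as in the example above
List_data <- list(1, 2, 3, 4, 5, 6, 7, 5, 4, 3)

# Logical flags: TRUE where a value repeats an earlier one
dup_flags <- duplicated(List_data)

# The repeated elements themselves (still a list)
dup_values <- List_data[dup_flags]
print(unlist(dup_values))

# Positions of the repeated elements
dup_positions <- which(dup_flags)
print(dup_positions)

# The list with the repeats dropped
unique_values <- List_data[!dup_flags]
print(length(unique_values))
```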

Let’s create a list that holds two lists and find the duplicates in each of them separately.

R
# Create a list with 2 lists
List_data <- list(list1 = list(100, 200, 300, 300, 300),
                  list2 = list("Java", "HTML", "PHP", "JSP", "Statistics"))
print(List_data)

# Find duplicates in list1 from List_data
print(duplicated(List_data$list1))

# Find duplicates in list2 from List_data
print(duplicated(List_data$list2))

Output:

$list1
$list1[[1]]
[1] 100

$list1[[2]]
[1] 200

$list1[[3]]
[1] 300

$list1[[4]]
[1] 300

$list1[[5]]
[1] 300


$list2
$list2[[1]]
[1] "Java"

$list2[[2]]
[1] "HTML"

$list2[[3]]
[1] "PHP"

$list2[[4]]
[1] "JSP"

$list2[[5]]
[1] "Statistics"

[1] FALSE FALSE FALSE  TRUE  TRUE

[1] FALSE FALSE FALSE FALSE FALSE

There are two duplicates in list1: the fourth and fifth elements repeat the value 300. list2 has no duplicates.
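When a list holds many sub-lists, calling duplicated() on each one by name gets repetitive. As a sketch, lapply() can apply it to every sub-list in one step. The names dup_by_sublist and has_dups are assumptions made for this example.

```r
# Same nested list as in the example above
List_data <- list(list1 = list(100, 200, 300, 300, 300),
                  list2 = list("Java", "HTML", "PHP", "JSP", "Statistics"))

# Apply duplicated() to every sub-list; the result keeps the sub-list names
dup_by_sublist <- lapply(List_data, duplicated)
print(dup_by_sublist)

# One TRUE/FALSE per sub-list: does it contain any duplicate at all?
has_dups <- sapply(dup_by_sublist, any)
print(has_dups)
```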

Let’s create a List having three vectors and find the duplicates in each vector.

R
# Create a list with 3 vectors
List_data <- list(Id = c(1, 2, 3, 4, 5, 4, 5),
                  Subject = c("Java", "HTML", "HTML", "Python"),
                  Marks = c(100, 89, 78, 69, 80))
print(List_data)

# Find duplicates in the Id
duplicated(List_data$Id)

# Find duplicates in the Subject
duplicated(List_data$Subject)

# Find duplicates in the Marks
duplicated(List_data$Marks)

Output:

$Id
[1] 1 2 3 4 5 4 5

$Subject
[1] "Java"   "HTML"   "HTML"   "Python"

$Marks
[1] 100  89  78  69  80

[1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE

[1] FALSE FALSE  TRUE FALSE

[1] FALSE FALSE FALSE FALSE FALSE

  1. Id holds two duplicate values, i.e., 4 and 5.
  2. Subject holds one duplicate value, i.e., “HTML”.
  3. There are no duplicates in the Marks vector.
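To list the duplicate values summarized above directly, rather than reading them off the logical vectors, each vector can be indexed with its own duplicated() result. This is a sketch only: find_dups is a hypothetical helper defined here, not a built-in R function.

```r
# Same list of vectors as in the example above
List_data <- list(Id = c(1, 2, 3, 4, 5, 4, 5),
                  Subject = c("Java", "HTML", "HTML", "Python"),
                  Marks = c(100, 89, 78, 69, 80))

# Hypothetical helper: return each repeated value once
find_dups <- function(x) unique(x[duplicated(x)])

# Apply the helper to every vector in the list
dup_values <- lapply(List_data, find_dups)
print(dup_values)
```

Wrapping the result in unique() means a value that appears three or more times is still reported only once.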

Let’s create a list with two vectors and return the total number of duplicate elements in each. To do this, pass the result of duplicated() to the sum() function. Since TRUE counts as 1 and FALSE as 0, the sum is the number of duplicates.

R
# Create a list with 2 vectors
List_data <- list(Id = c(1, 2, 3, 4, 5, 4, 5),
                  Subject = c("Java", "HTML", "HTML", "Python"))
print(List_data)

# Count duplicates in the Id
sum(duplicated(List_data$Id))

# Count duplicates in the Subject
sum(duplicated(List_data$Subject))

Output:

$Id
[1] 1 2 3 4 5 4 5

$Subject
[1] "Java"   "HTML"   "HTML"   "Python"

[1] 2

[1] 1

There are 2 duplicates in the Id vector and one duplicate in the Subject vector.
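The same counts can be computed for every vector at once. The sketch below uses sapply() for that, and also shows base R's anyDuplicated(), which returns the index of the first duplicate (or 0 if there is none). The names dup_counts and first_dup_index are illustrative.

```r
# Same list of vectors as in the example above
List_data <- list(Id = c(1, 2, 3, 4, 5, 4, 5),
                  Subject = c("Java", "HTML", "HTML", "Python"))

# Number of duplicates in each vector, returned as a named vector
dup_counts <- sapply(List_data, function(x) sum(duplicated(x)))
print(dup_counts)

# Index of the first duplicate in each vector (0 would mean no duplicates)
first_dup_index <- sapply(List_data, anyDuplicated)
print(first_dup_index)
```

anyDuplicated() can be a convenient quick check when only the presence of a duplicate matters, not how many there are.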

Conclusion

In conclusion, identifying duplicate values in a list in R is an essential step in data cleaning and quality assurance. The duplicated() function flags repeated values, and combining it with sum() counts them.
