Open In App

Python | Get duplicate tuples from list

Last Updated : 27 Feb, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

Sometimes, while working with records, we can have a problem of extracting those records which occur more than once. This kind of application can occur in web development domain. Let’s discuss certain ways in which this task can be performed.

 Method #1 : Using list comprehension + set() + count() Initial approach that can be applied is that we can iterate on each tuple and check it’s count in list using count(), if greater than one, we can add to list. To remove multiple additions, we can convert the result to set using set(). 

Python3




# Python3 code to demonstrate working of
# Get duplicate tuples from list
# Using list comprehension + set() + count()
 
# initialize list
test_list = [(3, 4), (4, 5), (3, 4),
            (3, 4), (4, 5), (6, 7)]
 
# printing original list
print("The original list : " + str(test_list))
 
# Get duplicate tuples from list
# Using list comprehension + set() + count()
res = list(set([ele for ele in test_list
            if test_list.count(ele) > 1]))
 
# printing result
print("All the duplicates from list are : " + str(res))


Output

The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are : [(4, 5), (3, 4)]

Time Complexity: O(n^2)
Auxiliary Space: O(n)

  Method #2 : Using Counter() + items() + list comprehension The combination of above functions can also be used to perform this particular task. In this, we just get the count of each occurrence of element using Counter() as dictionary and then extract all those whose value is above 1. 

Python3




# Python3 code to demonstrate working of
# Get duplicate tuples from list
# Using list comprehension + Counter() + items()
from collections import Counter
 
# initialize list
test_list = [(3, 4), (4, 5), (3, 4),
            (3, 4), (4, 5), (6, 7)]
 
# printing original list
print("The original list : " + str(test_list))
 
# Get duplicate tuples from list
# Using list comprehension + Counter() + items()
res = [ele for ele, count in Counter(test_list).items()
                                        if count > 1]
 
# printing result
print("All the duplicates from list are : " + str(res))


Output : 

The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are : [(4, 5), (3, 4)]

Time Complexity: O(n)
Auxiliary Space: O(n)

Method#3: Using dictionary. we can iterate through the list of tuples, and for each tuple, check if it exists in the dictionary. If it does, it means the tuple is a duplicate, and you can add it to a separate list of duplicate tuples. If it doesn’t, you can add it to the dictionary as a key with a value of 1.

Python3




# Initialize list
test_list = [(3, 4), (4, 5), (3, 4), (4, 5), (6, 7)]
print('The original list :' + str(test_list))
# Initialize an empty dictionary
d = {}
 
# Initialize an empty list to store duplicate tuples
duplicates = []
 
# Iterate through the list of tuples
for tup in test_list:
    if tup in d:
        # If the tuple already exists in the dictionary, it's a duplicate
        duplicates.append(tup)
    else:
        # If the tuple doesn't exist in the dictionary, add it
        d[tup] = 1
 
# Print the list of duplicate tuples
print('All the duplicates from list are :'+ str(duplicates))
 
#this code contributed by tvsk


Output

The original list :[(3, 4), (4, 5), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are :[(3, 4), (4, 5)]

Time Complexity: O(n), where n in the number of elements in list
Auxiliary Space: O(n), as it creates a dictionary with n elements. 

Method#4: Using for loop

Python3




test_list = [(3, 4), (4, 5), (3, 4),
            (3, 4), (4, 5), (6, 7)]
 
# printing original list
print("The original list : " + str(test_list))
 
# using for loop
res = []
for i in range(len(test_list)):
    for j in range(i+1, len(test_list)):
        if test_list[i] == test_list[j] and test_list[i] not in res:
            res.append(test_list[i])
 
# printing result
print("All the duplicates from list are : " + str(res))
#This code is contributed by Vinay pinjala.


Output

The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are : [(3, 4), (4, 5)]

Time Complexity: O(n)
Auxiliary Space: O(n)

Method#5: Using the groupby function from the itertools  :

This code uses a tool called groupby() to group similar things in a list of tuples together. It sorts the list of tuples so that duplicates are next to each other, and then groups them by the values in the tuples.

The code then looks for groups with more than one element (i.e., duplicates), and keeps track of the tuples that appear more than once in a new list. Finally, it prints the list of duplicate tuples.

In essence, the code sorts the input list of tuples, groups them by their values, filters the groups to include only duplicates, and outputs the duplicates in a new list.

Python3




from itertools import groupby
 
# Define the input list of tuples
test_list = [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
# printing original list
print("The original list : " + str(test_list))
  
 
# Sort the input list of tuples to group duplicates together
test_list.sort()
 
# Use the groupby() function to group the sorted list by each element of the tuple
# This creates an iterator that returns a tuple of the form (key, group) for each group of duplicates
grouped = groupby(test_list)
 
# Filter the iterator to include only groups with more than one element (i.e., duplicates)
duplicates = [key for key, group in grouped if len(list(group)) > 1]
 
# Print the result
print(duplicates)
 
#This code is contributed by Jyothi pinjala.


Output

The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
[(3, 4), (4, 5)]

Time Complexity: O(n log n)
Auxiliary Space: O(n)Method#5

Method#6: Using defaultdict

Algorithm:

  1. Initialize an empty dictionary freq_dict to store the frequency of each tuple in the list.
  2. Iterate over each tuple tpl in the test_list.
  3. Append the tuple to the list corresponding to its frequency in freq_dict.
  4. Get a list of tuples with a frequency greater than 1.
  5. Flatten the list of tuples to get the duplicate tuples.
  6. Remove duplicates from the list of duplicate tuples.
  7. Return the list of duplicate tuples.

Python3




from collections import defaultdict
 
# initialize list
test_list = [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
 
# printing original list
print("The original list : " + str(test_list))
 
# Using defaultdict to group tuples by frequency
freq_dict = defaultdict(list)
for tpl in test_list:
    freq_dict[tpl].append(tpl)
 
# Get duplicate tuples
res = [tpls for tpls in freq_dict.values() if len(tpls) > 1]
 
# Flatten the list of tuples to get the duplicate tuples
res = [tpl for tpls in res for tpl in tpls]
 
# Remove duplicates from the list of duplicate tuples
res = list(set(res))
res.sort()
# printing result
print("All the duplicates from list are : " + str(res))


Output

The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are : [(3, 4), (4, 5)]

Time Complexity:
The time complexity of this approach is O(n), where n is the length of the input list. This is because we need to iterate over each tuple in the list once to create the frequency dictionary, and then we need to iterate over the list of tuples with a frequency greater than 1 to get the duplicate tuples.

Auxiliary Space:
The auxiliary space complexity of this approach is O(n), where n is the length of the input list. This is because we need to create a dictionary with a key for each unique tuple in the input list, which could be as large as the length of the input list. However, the actual space used by the dictionary will depend on the number of unique tuples in the input list, so the space complexity could be less than O(n) in practice.



Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads