Python Program to Find Duplicate sets in list of sets

Last Updated : 18 Apr, 2023

Given a list of sets, the task is to write a Python program to find duplicate sets.

Input : test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
Output : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]
Explanation : {4, 5, 6, 1} and {6, 4, 1, 5} are equal sets, hence {1, 4, 5, 6} is part of the result.

Input : test_list = [{4, 5, 6, 9}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
Output : [frozenset({1, 3, 4})]
Explanation : {1, 3, 4, 3} reduces to {1, 3, 4}, which is equal to {1, 4, 3}, hence it is part of the result.
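
The methods below rely on two facts about Python sets: two sets compare equal regardless of element order or repeated literals, and a plain set is unhashable while a frozenset is, which is why frozenset() is used wherever sets must serve as dictionary or Counter keys. A quick illustrative sketch (the variable names are only for demonstration):

Python3

# duplicate literal 3 is discarded, so a becomes {1, 3, 4}
a = {1, 3, 4, 3}
b = {1, 4, 3}

# order and duplicates do not matter for equality
print(a == b)        # True

# a plain set cannot be hashed, but a frozenset of it can
fs = frozenset(a)
print(hash(fs))      # works
try:
    hash(a)
except TypeError as err:
    print("unhashable:", err)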

Method #1: Using Counter() + frozenset() + loop

In this approach, every set is converted into a frozenset() (to get a hashable type) and the frequency of each frozenset is computed using Counter(). Every frozenset whose count in the frequency counter is greater than 1 is a duplicate and is collected in the result.

Step-by-step approach:

  • Import Counter from the collections module; it is used to count the frequency of the sets in the list.
  • Initialize a list test_list of sets, each containing some integers.
  • Print the original list using print().
  • Convert each set in test_list into a frozen set using frozenset() and pass them to Counter(), storing the resulting frequency mapping in freqs.
  • Initialize an empty list res.
  • Iterate through the items in freqs using a for loop, extracting each key and value.
    • If the value is greater than 1, the key is a duplicate set, so append it to res.
  • Print the final result using print().

Below is the implementation of the above approach:

Python3




# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using Counter() + frozenset() + loop
from collections import Counter
 
# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]
              
# printing original list
print("The original list is : " + str(test_list))
 
# getting frequency using Counter()
freqs = Counter(frozenset(sub) for sub in test_list)
 
res = []
for key, val in freqs.items():
     
    # if frequency greater than 1, set is appended
    # [duplicate]
    if val > 1:
        res.append(key)
 
# printing result
print("Duplicate sets list : " + str(res))


Output

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

Time complexity: O(n), where n is the length of the input list. 
Auxiliary space: O(n), where n is the length of the input list. 

Method #2: Using list comprehension + Counter()

In this approach, we perform a similar task; the only difference is that a list comprehension is used as a one-liner to extract the duplicates from the frequency dictionary.

  1. Import the Counter class from the collections module.
  2. Initialize a list of sets called test_list with 5 sets containing different elements.
  3. Print the original list.
  4. Convert each set sub in test_list into a frozenset and pass them to Counter() to build freqs, a frequency mapping of the frozen sets.
  5. Use a list comprehension to create a new list res that contains only the sets occurring more than once: for each key, value pair in freqs, the key is added to res only if the value is greater than 1.
  6. Print the final result, which is a list of all the sets in test_list that occur more than once.

Python3




# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using list comprehension + Counter()
from collections import Counter
 
# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
              
# printing original list
print("The original list is : " + str(test_list))
 
# getting frequency using Counter()
freqs = Counter(frozenset(sub) for sub in test_list)
 
# list comprehension provides shorthand solution
res = [key for key, val in freqs.items() if val > 1]
 
# printing result
print("Duplicate sets list : " + str(res))


Output

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

Time Complexity: O(n), where n is the length of the test_list.
Auxiliary Space: O(n), where n is the length of the test_list.
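
Both of the methods above return frozensets, since that is what Counter stored as keys. If plain sets are preferred in the result, a minimal variation of the same one-liner (shown here only as a sketch) converts each key back with set():

Python3

from collections import Counter

# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]

# getting frequency using Counter()
freqs = Counter(frozenset(sub) for sub in test_list)

# convert duplicate frozensets back to ordinary sets
res = [set(key) for key, val in freqs.items() if val > 1]

# printing result
print("Duplicate sets list : " + str(res))   # [{1, 4, 5, 6}, {1, 3, 4}]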

Method #3: Using nested loop

In this approach, every pair of sets in the list is compared directly with a nested loop; whenever two sets are found equal, that set is added (as a frozenset) to a result set, so each duplicate is reported only once.

Python3




# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]

# printing original list
print("The original list is : " + str(test_list))

# creating an empty set to store the result
result = set()

# nested for loop to compare sets
for i in range(len(test_list)):
    for j in range(i + 1, len(test_list)):
        if test_list[i] == test_list[j]:
            result.add(frozenset(test_list[i]))

# printing result
print("Duplicate sets list : " + str(list(result)))

# This code is contributed by Jyothi pinjala


Output

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

Time Complexity: O(n^2)
Auxiliary Space: O(n)

Method #4: Using set() and filter()

This method uses set() to collect the unique sets. The map() function converts each set in test_list to a frozenset, which is hashable and can therefore be stored in a set or used as a dictionary key. The filter() function then keeps only those unique sets that are duplicates: the lambda checks whether the count of the set in test_list is greater than 1. Finally, the list of duplicate sets is printed.

Python3




# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using set() and filter()
 
# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]
              
# printing original list
print("The original list is : " + str(test_list))
 
# using set() to get unique sets
unique_sets = set(map(frozenset, test_list))
 
# using filter() to get duplicate sets
duplicate_sets = list(filter(lambda s: test_list.count(s) > 1, unique_sets))
 
# printing result
print("Duplicate sets list : " + str(duplicate_sets))


Output

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

Time Complexity: O(n^2), where n is the length of the input list test_list, since test_list.count() scans the whole list once for every unique set.
Auxiliary Space: O(n), where n is the length of the input list test_list.

Method #5: Using defaultdict

This approach uses a defaultdict to store the sets and their counts. Loop through the input list of sets and, for each set, increment the count of its frozenset key in the defaultdict; a frozenset is used as the key since sets are not hashable but frozensets are. Then loop through the defaultdict and collect the keys whose count is greater than 1, indicating duplicates. The result is a list of duplicate sets.

Python3




from collections import defaultdict
 
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
 
# create defaultdict to store sets with their count
set_counts = defaultdict(int)
for s in test_list:
    set_counts[frozenset(s)] += 1
 
# get sets with count greater than 1
duplicate_sets = [s for s, count in set_counts.items() if count > 1]
 
# print result
print("Duplicate sets list : " + str(duplicate_sets))


Output

Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

Time complexity: O(N*M), where N is the length of test_list and M is the size of the largest set in the list.
Auxiliary space: O(N*M), where N is the length of test_list and M is the size of the largest set in the list.

Method #6: Using Hashing with Dictionary

Use a dictionary to keep track of the sets already encountered while iterating through the list: each set is added to the dictionary as a frozenset key with value 1. If a set that is already present in the dictionary is encountered again, it is appended to the result list.

Python3




# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using Hashing with Dictionary
 
# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]
 
# printing original list
print("The original list is : " + str(test_list))
 
# using dictionary to check for duplicates
hash_dict = {}
res = []
for s in test_list:
    if frozenset(s) in hash_dict:
        # set already exists in dictionary, so it is a duplicate
        res.append(s)
    else:
        # add set to dictionary
        hash_dict[frozenset(s)] = 1
 
# printing result
print("Duplicate sets list : " + str(res))


Output

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [{1, 4, 5, 6}, {1, 3, 4}]

Time complexity: O(n), where n is the length of the input list.
Auxiliary space: O(n), where n is the length of the input list.
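
A small caveat with this method: if the same set occurs more than twice in the input, it is appended to res on every repeated occurrence, so the result list itself can contain repeats. A minimal adjustment (only a sketch, assuming each duplicate should be reported once and reusing test_list from above) flags each set the first time it is reported:

Python3

# report each duplicate set only once, even if it occurs 3+ times
hash_dict = {}
res = []
for s in test_list:
    key = frozenset(s)
    if key in hash_dict:
        if not hash_dict[key]:        # seen before but not yet reported
            res.append(s)
            hash_dict[key] = True     # mark as reported
    else:
        hash_dict[key] = False        # first occurrence, not a duplicate yet

# printing result
print("Duplicate sets list : " + str(res))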


