
Python Program to Find Duplicate sets in list of sets

Given a list of sets, the task is to write a Python program to find duplicate sets.

Input : test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
Output : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]
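Explanation : {4, 5, 6, 1} and {6, 4, 1, 5} contain the same elements, so {1, 4, 5, 6} is part of the result. Likewise, {1, 3, 4, 3} collapses to {1, 3, 4}, which equals {1, 4, 3}.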

Input : test_list = [{4, 5, 6, 9}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
Output : [frozenset({1, 3, 4})]
Explanation : {1, 3, 4, 3} collapses to {1, 3, 4}, which equals {1, 4, 3}, hence it is part of the result.
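
The examples above rely on two properties of Python sets: equality ignores both element order and repeated elements, and a plain set is unhashable while a frozenset is hashable. The following is a minimal sketch of both properties; the variable names are illustrative and not part of the original examples.

```python
# Set equality ignores element order and repeated literals.
a = {4, 5, 6, 1}
b = {6, 4, 1, 5}
c = {1, 3, 4, 3}   # the repeated 3 collapses, so this is {1, 3, 4}

same_elements = (a == b)
collapsed = (c == {1, 4, 3})

# A plain set cannot be hashed, so it cannot be a dict key or a set member.
try:
    hash(a)
    set_is_hashable = True
except TypeError:
    set_is_hashable = False

# Equal frozensets always produce equal hashes.
frozen_hashes_match = hash(frozenset(a)) == hash(frozenset(b))

print(same_elements, collapsed, set_is_hashable, frozen_hashes_match)
```

This is why every method below that counts sets converts them to frozenset before using them as keys.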

Method #1: Using Counter() + count() + frozenset() + loop

In this method, each set is converted to a frozenset() so that it becomes hashable, and Counter() tallies how often each frozenset occurs. A loop over the resulting frequency table then reads the count stored for each key and collects every set whose count is greater than 1.

Below is the implementation of the above approach:

# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using Counter() + count() + frozenset() + loop
from collections import Counter
# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]
# printing original list
print("The original list is : " + str(test_list))
# getting frequency using Counter()
freqs = Counter(frozenset(sub) for sub in test_list)
res = []
for key, val in freqs.items():
    # if frequency greater than 1, set is appended
    # [duplicate]
    if val > 1:
        res.append(key)
# printing result
print("Duplicate sets list : " + str(res))

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

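Time complexity: O(n) hash operations, where n is the length of the input list. Building each frozenset also costs time proportional to that set's size.
Auxiliary space: O(n), where n is the length of the input list, since the counter holds at most one key per set.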

Method #2 : Using list comprehension + Counter()

This method performs the same task as Method #1. The only difference is that a list comprehension replaces the explicit loop, extracting the duplicates from the frequency dictionary in one line.

  1. Import the Counter class from the collections module.
  2. Initialize a list of sets called test_list with 5 sets containing different elements.
  3. Print the original list.
  4. Pass a generator expression to Counter() to build freqs, a mapping from each distinct set to the number of times it occurs in test_list.
    *Each set sub in test_list is converted to a frozenset first, so that it is hashable and can serve as a key in freqs.
  5. Use a list comprehension to create a new list res that contains only the sets from freqs that occur more than once.
    *For each key, value pair in freqs, the key is added to res only if the value is greater than 1.
  6. Print the final result, which is a list of all the sets in test_list that occur more than once.

# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using list comprehension + Counter()
from collections import Counter
# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
# printing original list
print("The original list is : " + str(test_list))
# getting frequency using Counter()
freqs = Counter(frozenset(sub) for sub in test_list)
# list comprehension provides shorthand solution
res = [key for key, val in freqs.items() if val > 1]
# printing result
print("Duplicate sets list : " + str(res))

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

Time Complexity: O(n), where n is the length of the test_list.
Auxiliary Space: O(n), where n is the length of the test_list.
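
The Counter-based approach of Methods #1 and #2 can be wrapped in a reusable helper. The following is a minimal sketch; the function name find_duplicate_sets is hypothetical and not part of the article or of any library. It is applied here to the second input from the problem statement.

```python
from collections import Counter


def find_duplicate_sets(sets):
    """Return each distinct set that occurs more than once, as frozensets."""
    # frozenset makes each set hashable so Counter can use it as a key
    freqs = Counter(frozenset(s) for s in sets)
    return [key for key, val in freqs.items() if val > 1]


dups = find_duplicate_sets(
    [{4, 5, 6, 9}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
)
print("Duplicate sets list : " + str(dups))
```

Only {1, 3, 4} occurs twice in this input, so the helper returns a single frozenset.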

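Method #3 : Using nested loop

This method compares every pair of sets directly with ==. Whenever two sets at different positions are equal, a frozenset copy is added to a result set, so a set that occurs three or more times is still reported only once.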

# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]
# printing original list
print("The original list is : " + str(test_list))
# creating an empty set to store the result
result = set()
# nested for loop to compare sets
for i in range(len(test_list)):
    for j in range(i+1, len(test_list)):
        if test_list[i] == test_list[j]:
            # store a hashable copy; the result set ignores repeats
            result.add(frozenset(test_list[i]))
# printing result
print("Duplicate sets list : " + str(list(result)))
#This code is contributed by Jyothi pinjala

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

Time Complexity: O(n^2)
Auxiliary Space: O(n)

Method #4: Using set() and filter()

This method first converts each set in test_list to a frozenset with map(), because a frozenset is hashable and can therefore be stored in a set or used as a dictionary key. Passing the result to set() leaves one entry per distinct set. filter() then keeps only the entries for which the lambda finds a count greater than 1 in test_list, which works because a frozenset compares equal to a plain set with the same elements. Finally, the list of duplicate sets is printed.

# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using set() and filter()
# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]
# printing original list
print("The original list is : " + str(test_list))
# using set() to get unique sets
unique_sets = set(map(frozenset, test_list))
# using filter() to get duplicate sets
duplicate_sets = list(filter(lambda s: test_list.count(s) > 1, unique_sets))
# printing result
print("Duplicate sets list : " + str(duplicate_sets))

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

Time Complexity: O(n^2), where n is the length of the input list test_list, because test_list.count() scans the whole list once for each unique set.
Auxiliary Space: O(n), where n is the length of the input list test_list.
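
Method #4 depends on list.count() matching a frozenset against the plain sets stored in test_list. The short sketch below illustrates that behaviour on a smaller list; the variable names mixed_equal and occurrences are illustrative.

```python
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {7, 8, 9}]

# A frozenset and a set with the same elements compare equal ...
mixed_equal = frozenset({1, 4, 5, 6}) == {4, 5, 6, 1}

# ... so list.count() can match a frozenset against the plain sets in the list.
# Each call scans the whole list, which is where the quadratic cost comes from.
occurrences = test_list.count(frozenset({1, 4, 5, 6}))

print(mixed_equal, occurrences)
```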

Method #5: Using defaultdict

This approach uses a defaultdict to store the sets and their counts. Loop through the input list of sets and add each set to the defaultdict. The frozenset is used as the key since sets are not hashable, but frozensets are. Then loop through the defaultdict to get the sets with count greater than 1, indicating duplicates. The result is a list of duplicate sets.

from collections import defaultdict
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
# create defaultdict to store sets with their count
set_counts = defaultdict(int)
for s in test_list:
    set_counts[frozenset(s)] += 1
# get sets with count greater than 1
duplicate_sets = [s for s, count in set_counts.items() if count > 1]
# print result
print("Duplicate sets list : " + str(duplicate_sets))

Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]

Time complexity: O(N*M), where N is the length of test_list and M is the size of the largest set in the list.
Auxiliary space: O(N*M), where N is the length of test_list and M is the size of the largest set in the list.
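
A defaultdict(int) used this way builds the same frequency table that Counter() produces. The sketch below, offered as an illustration rather than as part of the method, checks that on the sample input; the names counter_counts and counts_agree are assumptions introduced here.

```python
from collections import Counter, defaultdict

test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]

set_counts = defaultdict(int)
for s in test_list:
    # a missing key starts at int() == 0, so no membership check is needed
    set_counts[frozenset(s)] += 1

counter_counts = Counter(frozenset(s) for s in test_list)

# compare both tables as plain dictionaries
counts_agree = dict(set_counts) == dict(counter_counts)
print(counts_agree)
```

The choice between the two is mostly a matter of style: Counter() is shorter, while defaultdict makes the counting step explicit.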

Method #6: Using Hashing with Dictionary

Use a dictionary to keep track of the sets that we have already encountered while iterating through the list. We will add the sets to the dictionary as keys and set their values to 1. If we come across a set that is already in the dictionary, we will append it to the result list.

# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using Hashing with Dictionary
# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]
# printing original list
print("The original list is : " + str(test_list))
# using dictionary to check for duplicates
hash_dict = {}
res = []
for s in test_list:
    if frozenset(s) in hash_dict:
        # set already exists in dictionary, so it is a duplicate
        res.append(s)
    else:
        # add set to dictionary
        hash_dict[frozenset(s)] = 1
# printing result
print("Duplicate sets list : " + str(res))

The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [{1, 4, 5, 6}, {1, 3, 4}]

Time complexity: O(n), where n is the length of the input list.
Auxiliary space: O(n), where n is the length of the input list.
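
One property of the approach described in Method #6 is that a set occurring three times is appended twice, because every repeat after the first sighting is added to the result. If each duplicate should be reported only once, one possible variant is sketched below; the names seen and reported are illustrative assumptions, not part of the original method.

```python
test_list = [{1, 2}, {2, 1}, {1, 2}, {3}]

seen = set()       # frozensets encountered so far
reported = set()   # frozensets already added to the result
res = []
for s in test_list:
    key = frozenset(s)
    if key in seen and key not in reported:
        # second sighting: report it once, then remember that we did
        res.append(s)
        reported.add(key)
    seen.add(key)

print("Duplicate sets list : " + str(res))
```

Here {1, 2} appears three times but ends up in res a single time.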
