Python Program to Find Duplicate Sets in a List of Sets

Last Updated : 18 Apr, 2023

Given a list of sets, the task is to write a Python program to find duplicate sets.

Input : test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
Output : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]
Explanation : {1, 4, 5, 6} is equal to {6, 4, 1, 5}, hence part of the result. Likewise, {1, 3, 4, 3} collapses to {1, 3, 4}, which is equal to {1, 4, 3}.

Input : test_list = [{4, 5, 6, 9}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
Output : [frozenset({1, 3, 4})]
Explanation : {1, 3, 4, 3} collapses to {1, 3, 4}, which is equal to {1, 4, 3}, hence part of the result.

Method #1: Using Counter() + frozenset() + loop

In this, each set is converted to a frozenset() [ to get a hashable type ] and the frozensets are counted using Counter(). A loop over the resulting frequency counter then collects every set whose count is greater than 1.
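A minimal sketch of why the conversion is needed: a plain set cannot be hashed, while frozensets holding the same elements hash and compare equal regardless of the order in which the elements were written. The variable names here are illustrative only.

```python
# Plain sets are mutable and therefore unhashable.
try:
    hash({4, 5, 6, 1})
    set_is_hashable = True
except TypeError:
    set_is_hashable = False

# frozensets are immutable and hashable; element order does not matter.
a = frozenset({4, 5, 6, 1})
b = frozenset({6, 4, 1, 5})
same_key = (a == b) and (hash(a) == hash(b))

print(set_is_hashable)  # False
print(same_key)         # True
```

Because equal frozensets produce the same dictionary key, a frequency counter sees them as one entry.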

Step-by-step approach:

• Import Counter from the collections module; it is used to count how often each set occurs.
• Initialize test_list, a list of sets of integers.
• Print the original list using print().
• Pass a generator expression to Counter() that converts each set in test_list into a frozen set using frozenset(). The resulting freqs maps each distinct frozen set to its number of occurrences.
• Initialize an empty list res.
• Iterate through freqs.items() using a for loop, unpacking each item into a key (the frozen set) and a value (its count).
• If the value is greater than 1, the key is a duplicate set, so it is appended to res.
• Print the final result using print().

Below is the implementation of the above approach:

Python3

```python
# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using Counter() + frozenset() + loop
from collections import Counter

# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]

# printing original list
print("The original list is : " + str(test_list))

# getting frequency using Counter()
freqs = Counter(frozenset(sub) for sub in test_list)

res = []
for key, val in freqs.items():

    # if frequency greater than 1, set is appended
    # [duplicate]
    if val > 1:
        res.append(key)

# printing result
print("Duplicate sets list : " + str(res))
```

Output

```
The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]
```

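Time complexity: O(n) dictionary operations, where n is the length of the input list; each frozenset() conversion additionally takes time proportional to the size of its set.
Auxiliary space: O(n) frozensets, where n is the length of the input list.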

Method #2 : Using list comprehension + Counter()

In this, we perform a similar task; the only difference is that a list comprehension is used as a one-liner to extract the duplicates from the frequency dictionary.

1. Import the Counter class from the collections module.
2. Initialize a list called test_list containing 5 sets.
3. Print the original list.
4. Build the frequency counter freqs by passing a generator expression to Counter().
*For each set sub in test_list, frozenset(sub) is created and counted, so freqs maps each distinct frozen set to its number of occurrences.
5. Use a list comprehension to create a new list res that contains only the keys of freqs that occur more than once.
*For each key, value pair in freqs.items(), the key is added to res only if the value is greater than 1.
6. Print the final result, which lists each set that occurs more than once in test_list.

Python3

```python
# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using list comprehension + Counter()
from collections import Counter

# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]

# printing original list
print("The original list is : " + str(test_list))

# getting frequency using Counter()
freqs = Counter(frozenset(sub) for sub in test_list)

# list comprehension provides shorthand solution
res = [key for key, val in freqs.items() if val > 1]

# printing result
print("Duplicate sets list : " + str(res))
```

Output

```
The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]
```

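Time Complexity: O(n) dictionary operations, where n is the length of test_list; each frozenset() conversion additionally takes time proportional to the size of its set.
Auxiliary Space: O(n) frozensets, where n is the length of test_list.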
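The Counter-based approach can also be packaged as a reusable function. The sketch below is one possible arrangement; the name find_duplicate_sets is an assumption made for illustration and does not come from the original code. It is run on the second sample input from the top of the article.

```python
from collections import Counter


def find_duplicate_sets(sets):
    """Return each distinct set that occurs more than once, as a frozenset."""
    # Count every set under a hashable frozenset key.
    freqs = Counter(frozenset(s) for s in sets)
    # Keep only the keys that were counted more than once.
    return [key for key, count in freqs.items() if count > 1]


dupes = find_duplicate_sets(
    [{4, 5, 6, 9}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]
)
print(dupes)  # [frozenset({1, 3, 4})]
```

Returning the list instead of printing it keeps the logic separate from the demonstration output.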

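Method #3: Using nested loops

In this, every pair of sets is compared with ==, and each set that equals a later set in the list is added, as a frozenset, to a result set so that it is stored only once.

Python3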

```python
# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]

# printing original list
print("The original list is : " + str(test_list))

# creating an empty set to store the result
result = set()

# nested for loop to compare sets
for i in range(len(test_list)):
    for j in range(i + 1, len(test_list)):
        if test_list[i] == test_list[j]:
            result.add(frozenset(test_list[i]))

# printing result
print("Duplicate sets list : " + str(list(result)))

# This code is contributed by Jyothi pinjala
```

Output

```
The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]
```

Time Complexity: O(n^2)
Auxiliary Space: O(n)
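The nested index loops above visit every pair of positions (i, j) with i < j. As a sketch of the same pairwise comparison, itertools.combinations can generate those pairs directly; this is an alternative spelling shown for illustration, not the original author's code, and it is still quadratic in the length of the list.

```python
from itertools import combinations

test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]

# combinations(test_list, 2) yields every pair (a, b) with a before b,
# which is exactly what the index-based nested loops enumerate.
result = {frozenset(a) for a, b in combinations(test_list, 2) if a == b}

print(result == {frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})})  # True
```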

Method #4: Using set() and filter()

This method uses the map() function to convert each set in test_list to a frozenset, which is hashable and can therefore be stored in a set, and the set() function to collect the unique frozensets. The filter() function then keeps only those unique sets for which the lambda function returns True, that is, those whose count in test_list is greater than 1. Finally, the list of duplicate sets is printed.

Python3

```python
# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using set() and filter()

# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]

# printing original list
print("The original list is : " + str(test_list))

# using set() to get unique sets
unique_sets = set(map(frozenset, test_list))

# using filter() to get duplicate sets
duplicate_sets = list(filter(lambda s: test_list.count(s) > 1, unique_sets))

# printing result
print("Duplicate sets list : " + str(duplicate_sets))
```

Output

```
The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]
```

Time Complexity: The time complexity of this code is O(n^2), where n is the length of the input list test_list.
Auxiliary Space: The auxiliary space complexity of this code is O(n), where n is the length of the input list test_list.
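The filter() step relies on a detail that is easy to miss: test_list holds plain sets, while the values passed to count() are frozensets. The short sketch below, using a shortened illustrative list, shows that a frozenset compares equal to a set with the same elements, which is why list.count() finds the matches.

```python
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {7, 8, 9}]

key = frozenset({1, 4, 5, 6})

# A frozenset compares equal to a set holding the same elements,
# so list.count() matches both plain sets in test_list.
occurrences = test_list.count(key)

print(key == {4, 5, 6, 1})  # True
print(occurrences)          # 2
```

Each count() call scans the whole list, which is where the quadratic running time comes from.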

Method #5: Using defaultdict

This approach uses a defaultdict to store the sets and their counts. Loop through the input list of sets and add each set to the defaultdict. The frozenset is used as the key since sets are not hashable, but frozensets are. Then loop through the defaultdict to get the sets with count greater than 1, indicating duplicates. The result is a list of duplicate sets.

Python3

```python
from collections import defaultdict

test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]

# create defaultdict to store sets with their count
set_counts = defaultdict(int)
for s in test_list:
    set_counts[frozenset(s)] += 1

# get sets with count greater than 1
# (the loop variable is named fs so the built-in set is not shadowed)
duplicate_sets = [fs for fs, count in set_counts.items() if count > 1]

# print result
print("Duplicate sets list : " + str(duplicate_sets))
```

Output

`Duplicate sets list : [frozenset({1, 4, 5, 6}), frozenset({1, 3, 4})]`

Time complexity: O(N*M), where N is the length of test_list and M is the size of the largest set in the list.
Auxiliary space: O(N*M), where N is the length of test_list and M is the size of the largest set in the list.
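A small variation on the same idea, offered as a sketch rather than as part of the original method: storing a list of indices per key instead of an integer count also shows where each duplicate sits in test_list. The names positions and duplicate_positions are illustrative.

```python
from collections import defaultdict

test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3}, {1, 4, 3}, {7, 8, 9}]

# Map each frozenset key to the list of positions where it occurs.
positions = defaultdict(list)
for index, s in enumerate(test_list):
    positions[frozenset(s)].append(index)

# Keep only the keys seen at more than one position.
duplicate_positions = {
    fs: idxs for fs, idxs in positions.items() if len(idxs) > 1
}

for fs, idxs in duplicate_positions.items():
    print(sorted(fs), idxs)
# [1, 4, 5, 6] [0, 1]
# [1, 3, 4] [2, 3]
```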

Method #6: Using Hashing with Dictionary

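Use a dictionary to keep track of the sets already encountered while iterating through the list. Each set is converted to a frozenset and added to the dictionary as a key with the value 1. If the frozenset of a set is already in the dictionary, the set is a duplicate and is appended to the result list. Unlike the previous methods, the result holds the original sets rather than frozensets, and a set that occurs k times is appended k - 1 times.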

Python3

```python
# Python3 code to demonstrate working of
# Duplicate sets in list of sets
# Using Hashing with Dictionary

# initializing list
test_list = [{4, 5, 6, 1}, {6, 4, 1, 5}, {1, 3, 4, 3},
             {1, 4, 3}, {7, 8, 9}]

# printing original list
print("The original list is : " + str(test_list))

# using dictionary to check for duplicates
hash_dict = {}
res = []
for s in test_list:
    if frozenset(s) in hash_dict:
        # set already exists in dictionary, so it is a duplicate
        res.append(s)
    else:
        # add set to dictionary
        hash_dict[frozenset(s)] = 1

# printing result
print("Duplicate sets list : " + str(res))
```

Output

```
The original list is : [{1, 4, 5, 6}, {1, 4, 5, 6}, {1, 3, 4}, {1, 3, 4}, {8, 9, 7}]
Duplicate sets list : [{1, 4, 5, 6}, {1, 3, 4}]
```

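Time complexity: O(n) dictionary operations, where n is the length of the input list; each frozenset() conversion additionally takes time proportional to the size of its set.
Auxiliary space: O(n) frozensets, where n is the length of the input list.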
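In the loop above, a set that occurs three or more times is appended to res once for every repeat after the first. If each duplicate should be reported only once, one possible adjustment is to track the already reported keys as well. The sketch below assumes that behaviour is wanted; the function name duplicate_sets_once is hypothetical.

```python
def duplicate_sets_once(sets):
    """Return each duplicated set once, in order of its first repeat."""
    seen = set()      # frozenset keys encountered so far
    reported = set()  # keys already appended to the result
    res = []
    for s in sets:
        key = frozenset(s)
        if key in seen and key not in reported:
            res.append(s)
            reported.add(key)
        seen.add(key)
    return res


print(duplicate_sets_once([{1, 2}, {2, 1}, {1, 2}, {3}]))  # [{1, 2}]
```

A plain set is used for seen here because only membership matters, not an associated value.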
