Python – Remove Duplicate subset Tuples

• Last Updated : 13 Mar, 2023

Sometimes, while working with lists of Python tuples, we need to remove tuples whose elements are already contained in another, larger tuple in the same list. This kind of cleanup can be useful in data preprocessing. Let's discuss several ways in which this task can be performed.

Example:

Input : test_list = [(6, 9, 17, 18), (15, 34, 56), (6, 7, 10), (6, 7), (6, 9), (15, 34)], K = 2
Output : [(6, 9, 17, 18), (15, 34, 56), (6, 7, 10)]

Input : test_list = [(6, 9, 17, 18), (15, 34, 56), (6, 7, 10)], K = 2
Output : [(6, 9, 17, 18), (15, 34, 56), (6, 7, 10)]
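
The first example above can be checked with a small helper. This is a minimal sketch and not one of the methods discussed below: the function name `remove_subset_tuples` is hypothetical, and it assumes that a tuple should be dropped when all of its elements occur in some other, strictly larger tuple, ignoring element order.

```python
def remove_subset_tuples(tuples):
    """Keep only tuples that are not a proper subset of another tuple.

    Hypothetical helper for illustration; element order is ignored.
    """
    sets = [frozenset(t) for t in tuples]
    result = []
    for i, current in enumerate(sets):
        # `<` on sets is the proper-subset test, so exact duplicates are kept.
        is_subset = any(current < other
                        for j, other in enumerate(sets) if j != i)
        if not is_subset:
            result.append(tuples[i])
    return result


test_list = [(6, 9, 17, 18), (15, 34, 56), (6, 7, 10), (6, 7), (6, 9), (15, 34)]
print(remove_subset_tuples(test_list))
# [(6, 9, 17, 18), (15, 34, 56), (6, 7, 10)]
```

Because the comparison is a proper-subset test, two identical tuples are both kept; they would need a separate de-duplication step.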

Method #1 : Using setdefault() + list comprehension
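This is one way to solve the task. We group the tuples in a dictionary keyed by their first K elements, using setdefault() to create each group on first use. A list comprehension then keeps every tuple longer than K, and keeps a shorter tuple only if its group contains no other tuple. Changing K changes the size of the tuples that are candidates for removal. Note that this detects only tuples that match the leading K elements of a longer tuple, not subsets in general.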

Python3

```python
# Python3 code to demonstrate working of
# Remove Duplicate subset Tuples
# Using setdefault() + list comprehension

# initializing list
test_list = [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]

# printing original list
print("The original list is : " + str(test_list))

# initializing K
K = 2

# Remove Duplicate subset Tuples
# Using setdefault() + list comprehension
temp = {}
for sub in test_list:
    temp2 = sub[:K]
    temp.setdefault(temp2, []).append(sub)
res = [sub for sub in test_list if len(sub) > K or len(temp[sub]) == 1]

# printing result
print("Tuple list after removal : " + str(res))
```

Output :

```
The original list is : [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]
Tuple list after removal : [(6, 9, 17, 18), (15, 34, 56), (6, 7)]
```

Time complexity: O(n), where n is the length of the input list.
Auxiliary space: O(n), for the dictionary used to store subsets of tuples.
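
Because Method #1 groups tuples by their first K elements, it only catches a K-length tuple that is a prefix of a longer one. The sketch below wraps the same logic in a hypothetical function, `remove_prefix_tuples`, to make that behaviour visible. The second input is an assumed example, not taken from the article.

```python
def remove_prefix_tuples(tuples, k):
    """Same logic as Method #1, wrapped in a function (hypothetical name)."""
    groups = {}
    for tup in tuples:
        # Group every tuple under its first k elements.
        groups.setdefault(tup[:k], []).append(tup)
    # Keep long tuples, and short ones that are alone in their group.
    return [tup for tup in tuples if len(tup) > k or len(groups[tup]) == 1]


# (6, 9) is the 2-element prefix of (6, 9, 17, 18), so it is removed.
print(remove_prefix_tuples([(6, 9, 17, 18), (6, 9)], 2))
# [(6, 9, 17, 18)]

# (9, 17) is a subset but not a prefix, so it is NOT removed.
print(remove_prefix_tuples([(6, 9, 17, 18), (9, 17)], 2))
# [(6, 9, 17, 18), (9, 17)]
```

If subsets at arbitrary positions must be removed, a set-based comparison is a better fit than prefix grouping.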

Method #2 : Using all() + any() + loop
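Combining these functions gives another way to solve the problem, this time testing for subsets of any size. The list is first sorted by tuple length so that a possible superset always comes after its subsets. For each tuple, all() checks whether every one of its elements occurs in a given later tuple, and any() checks whether that holds for at least one later tuple. Tuples for which it holds are skipped; the rest are appended to the result. Because of the sort, the result is ordered by tuple length rather than by original position.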

Python3

```python
# Python3 code to demonstrate working of
# Remove Duplicate subset Tuples
# Using all() + any() + loop

# initializing list
test_list = [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]

# printing original list
print("The original list is : " + str(test_list))

# Remove Duplicate subset Tuples
# Using all() + any() + loop
res = []
test_list = sorted(test_list, key=len)
for idx, sub in enumerate(test_list):
    if not any(all(ele in sub2 for ele in sub) for sub2 in test_list[idx + 1:]):
        res.append(sub)

# printing result
print("Tuple list after removal : " + str(res))
```

Output :

```
The original list is : [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]
Tuple list after removal : [(6, 7), (15, 34, 56), (6, 9, 17, 18)]
```

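Time complexity: O(n^2 * m^2) in the worst case, where n is the length of the input list test_list and m is the length of the longest tuple, since each pair of tuples is compared with a nested membership test. The initial sort adds O(n log n).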
Auxiliary Space: O(n), where n is the length of the input list test_list.

Method #3: Using set() + frozenset()

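We keep a collection of frozensets for the tuples accepted so far. Each tuple is converted to a set, and the subset operator <= checks whether it is contained in any accepted set. If it is not, we append the tuple to the result and record its frozenset. Because each tuple is compared only with tuples accepted earlier, this approach relies on every superset appearing before its subsets, as is the case in the sample list.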

Python3

```python
test_list = [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]
res = []
seen = set()

for tup in test_list:
    if not any(set(tup) <= s for s in seen):
        res.append(tup)
        seen.add(frozenset(tup))

print("Tuple list after removal : " + str(res))
```

Output

```
Tuple list after removal : [(6, 9, 17, 18), (15, 34, 56), (6, 7)]
```

Time complexity: O(n^2) set comparisons in the worst case, because each tuple is compared against every set accepted so far.
Auxiliary space: O(n), because we store a frozenset for each accepted tuple.
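
The loop in Method 3 compares each tuple only with tuples that were accepted earlier, so a subset that appears before its superset would survive. One possible way around this, sketched below under that assumption, is to visit the tuples from largest to smallest and then restore the original order. The function name `remove_subsets_any_order` and the reordered input are hypothetical.

```python
def remove_subsets_any_order(tuples):
    """Order-independent variant of the seen-set idea (hypothetical helper)."""
    kept_sets = []
    kept_indices = []
    # Visit larger tuples first so every superset is seen before its subsets.
    order = sorted(range(len(tuples)),
                   key=lambda i: len(set(tuples[i])), reverse=True)
    for i in order:
        current = set(tuples[i])
        if not any(current <= s for s in kept_sets):
            kept_sets.append(current)
            kept_indices.append(i)
    # Report the survivors in their original input order.
    return [tuples[i] for i in sorted(kept_indices)]


# Subsets come BEFORE their supersets in this input.
shuffled = [(6, 9), (15, 34), (6, 7), (6, 9, 17, 18), (15, 34, 56)]
print(remove_subsets_any_order(shuffled))
# [(6, 7), (6, 9, 17, 18), (15, 34, 56)]
```

Since the comparison uses `<=`, tuples with identical element sets are also collapsed to the first one encountered.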

Method #4: Using the dict.fromkeys() method

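In this approach, we pass the list to dict.fromkeys(), which creates a dictionary whose keys are the tuples and whose values are all None. Since a dictionary cannot hold duplicate keys, converting the keys back to a list removes exact duplicate tuples while preserving their first-seen order. Note that this does not remove subset tuples: (6, 9) and (6, 9, 17, 18) are different keys, so the sample list, which contains no exact duplicates, is returned unchanged.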

Python3

```python
# Python3 code to demonstrate working of
# Remove Duplicate subset Tuples
# Using dict.fromkeys() method

# initializing list
test_list = [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]

# printing original list
print("The original list is : " + str(test_list))

# Remove Duplicate subset Tuples
# Using dict.fromkeys() method
res = list(dict.fromkeys(test_list))

# printing result
print("Tuple list after removal : " + str(res))
```

Output

```
The original list is : [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]
Tuple list after removal : [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]
```

Time complexity: O(n), where n is the size of the list
Auxiliary space: O(n), where n is the size of the list
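
To see what dict.fromkeys() actually removes, it helps to run it on a list that contains a repeated tuple. The input below is an assumed example, not taken from the article.

```python
# (6, 9) appears twice; (6, 9, 17, 18) is a superset of it, not a duplicate.
with_repeats = [(6, 9), (15, 34), (6, 9), (6, 9, 17, 18)]

# Dictionary keys are unique and keep insertion order (Python 3.7+).
unique = list(dict.fromkeys(with_repeats))

print(unique)
# [(6, 9), (15, 34), (6, 9, 17, 18)]
```

Only the second (6, 9) is dropped. The subset relationship between (6, 9) and (6, 9, 17, 18) is not detected.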

Method #5: Using itertools.groupby() function

Here we use the groupby() function from the itertools module. The list is sorted first so that equal tuples become adjacent; groupby() then groups them, and we keep the first tuple from each group, since all tuples in a group are identical. As with dict.fromkeys(), this removes only exact duplicates, not subset tuples, and the result is in sorted order rather than the original order.

Python3

```python
# Python3 code to demonstrate working of
# Remove Duplicate subset Tuples
# Using itertools.groupby() function

# import groupby from itertools module
from itertools import groupby

# initializing list
test_list = [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]

# printing original list
print("The original list is : " + str(test_list))

# Remove Duplicate subset Tuples
# Using itertools.groupby() function
res = [next(group) for _, group in groupby(sorted(test_list))]

# printing result
print("Tuple list after removal : " + str(res))
```

Output:

```
The original list is : [(6, 9, 17, 18), (15, 34, 56), (6, 7), (6, 9), (15, 34)]
Tuple list after removal : [(6, 7), (6, 9), (6, 9, 17, 18), (15, 34), (15, 34, 56)]
```

Time complexity: O(n log n), because of the sorting operation used before applying the groupby() function.
Auxiliary space: O(n), where n is the size of the input list.
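
The sort matters because groupby() only merges equal items that sit next to each other. The short sketch below, using an assumed input with one repeated tuple, contrasts the two cases.

```python
from itertools import groupby

# (6, 9) appears twice, but the copies are not adjacent.
data = [(6, 9), (15, 34), (6, 9)]

# Without sorting, the two (6, 9) tuples land in separate groups.
unsorted_result = [key for key, _ in groupby(data)]

# After sorting, equal tuples are adjacent and collapse into one group.
sorted_result = [key for key, _ in groupby(sorted(data))]

print(unsorted_result)  # [(6, 9), (15, 34), (6, 9)]
print(sorted_result)    # [(6, 9), (15, 34)]
```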
