Python program to Remove Duplicity from a Dictionary

Last Updated : 03 Jun, 2023

Given a dictionary whose values are lists, the task is to write a Python program that removes entries whose content has already occurred, whether it occurred as a key or as a value. In the examples below, a key is dropped when it already appears in the value list of an earlier key.

Input : {'gfg': ['gfg', 'is', 'best'], 'best': ['gfg'], 'apple': ['good']}

Output : {'gfg': ['gfg', 'is', 'best'], 'apple': ['good']}

Explanation : The 'best' key is omitted because it already occurs as a value of the first key.

Input : {'gfg': ['gfg', 'is', 'best', 'apple'], 'best': ['gfg'], 'apple': ['good']}

Output : {'gfg': ['gfg', 'is', 'best', 'apple']}

Explanation : The 'best' and 'apple' keys are omitted because both already occur as values of the first key.

Method 1: Using a loop and items()

Here we iterate over the key-value pairs with items() and remember the entries already kept. A key is added to the result only if it does not occur in the value list of an entry that is already in the result.

Python3

# initializing dictionary
test_dict = {'gfg': ['gfg', 'is', 'best'],
             'best': ['gfg'],
             'apple': ['good']}
 
# printing original dictionary
print("The original dictionary is : " + str(test_dict))
 
res = dict()
for key, val in test_dict.items():
 
    flag = True
    for key1, val1 in res.items():
 
        # filtering value from memorised values
        if key in val1:
            flag = False
    if flag:
        res[key] = val
 
# printing result
print("The filtered dictionary : " + str(res))

Output:

The original dictionary is : {'gfg': ['gfg', 'is', 'best'], 'best': ['gfg'], 'apple': ['good']}

The filtered dictionary : {'gfg': ['gfg', 'is', 'best'], 'apple': ['good']}
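
The inner loop of Method 1 rescans every kept value list for each new key. As a minimal sketch, assuming all list items are hashable (they are strings in the examples), the same rule can be written with a set of seen values. The function name remove_duplicity is a hypothetical one chosen for illustration, not part of the original program.

```python
def remove_duplicity(data):
    """Keep a key only if it has not occurred in the value list of a kept key."""
    seen_values = set()  # every value that belongs to an entry kept so far
    result = {}
    for key, values in data.items():
        if key in seen_values:
            continue  # the key already occurred as a value, so skip the entry
        result[key] = values
        seen_values.update(values)
    return result


first = {'gfg': ['gfg', 'is', 'best'], 'best': ['gfg'], 'apple': ['good']}
second = {'gfg': ['gfg', 'is', 'best', 'apple'], 'best': ['gfg'], 'apple': ['good']}

print(remove_duplicity(first))   # {'gfg': ['gfg', 'is', 'best'], 'apple': ['good']}
print(remove_duplicity(second))  # {'gfg': ['gfg', 'is', 'best', 'apple']}
```

Both sample inputs from the problem statement produce the expected outputs, and each key lookup is a constant-time set membership test on average.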

Method 2: Using a set and a new dictionary

Approach

We first initialize an empty set to track the values already seen and an empty dictionary to hold the result. For each key-value pair in the original dictionary, we build a list of the values that are not yet in the set, adding each one to the set as it is appended. If that list is not empty, we store it under the key in the new dictionary; otherwise the key is dropped. Finally, we return the new dictionary. Note that this method removes repeated values rather than keys that occurred as values, so a surviving key may keep only part of its original list, and the result can differ from Method 1 on other inputs (for the second example input above, the 'apple' key is kept because 'good' has not been seen).

Algorithm

1. Initialize an empty set to keep track of values already seen.
2. Initialize an empty dictionary to store non-duplicate values.
3. Loop through each key-value pair in the original dictionary:
   a. Get the list of values for the current key.
   b. Initialize an empty list to store non-duplicate values.
   c. Loop through each value in the list:
      i. If the value has not been seen before, add it to the set and append it to the list of non-duplicate values.
   d. If the list of non-duplicate values is not empty, add the key-value pair to the new dictionary.
4. Return the new dictionary.

Python3

def remove_duplicates(input_dict):
    # values encountered so far, across all keys
    seen_values = set()
    new_dict = {}
    for key, values in input_dict.items():
        # values of this key that have not been seen before
        non_duplicate_values = []
        for value in values:
            if value not in seen_values:
                seen_values.add(value)
                non_duplicate_values.append(value)
        # keep the key only if at least one of its values is new
        if non_duplicate_values:
            new_dict[key] = non_duplicate_values
    return new_dict


input_dict = {'gfg': ['gfg', 'is', 'best'],
              'best': ['gfg'],
              'apple': ['good']}
print(remove_duplicates(input_dict))

Output

{'gfg': ['gfg', 'is', 'best'], 'apple': ['good']}

Time Complexity: O(nm), where n is the number of key-value pairs in the original dictionary and m is the maximum length of the value lists.
Auxiliary space: O(nm), where n is the number of key-value pairs in the original dictionary and m is the maximum length of the value lists.
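
Because this method tracks values rather than keys, its result depends on which values repeat. The sketch below restates the function so that it runs on its own and applies it to the second sample input from the problem statement and to a small made-up dictionary (an assumption added for illustration) in which a list is only partly trimmed.

```python
def remove_duplicates(input_dict):
    # Same logic as the Method 2 function, restated so this snippet is self-contained.
    seen_values = set()
    new_dict = {}
    for key, values in input_dict.items():
        non_duplicate_values = []
        for value in values:
            if value not in seen_values:
                seen_values.add(value)
                non_duplicate_values.append(value)
        if non_duplicate_values:
            new_dict[key] = non_duplicate_values
    return new_dict


second = {'gfg': ['gfg', 'is', 'best', 'apple'], 'best': ['gfg'], 'apple': ['good']}
# 'apple' survives here because its only value, 'good', has not been seen.
print(remove_duplicates(second))
# {'gfg': ['gfg', 'is', 'best', 'apple'], 'apple': ['good']}

# Hypothetical input: the value 2 repeats, so only 3 remains under 'b'.
print(remove_duplicates({'a': [1, 2], 'b': [2, 3]}))
# {'a': [1, 2], 'b': [3]}
```

On the second sample input this differs from the output listed in the problem statement, which drops the 'apple' key as well.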

Method 3: Using a dictionary comprehension with a condition

Initialize a dictionary named test_dict with some key-value pairs where each value is a list of strings.
Print the original dictionary test_dict.
Create a filtered dictionary filtered_dict using a dictionary comprehension with a condition on the values.
The comprehension iterates over each key-value pair of test_dict using the items() method.
Inside the comprehension, it checks whether the current value list is a subset of any other, different value list in test_dict, using set() and the <= operator.
If it is a subset of another value list, that key-value pair is left out of filtered_dict.
Print the filtered dictionary filtered_dict.

Python3

# Initializing dictionary
test_dict = {'gfg': ['gfg', 'is', 'best'],
             'best': ['gfg'],
             'apple': ['good']}
 
# Printing original dictionary
print("The original dictionary is : " + str(test_dict))
 
# Using a dictionary comprehension that keeps an entry only if its value list
# is not a subset of some other, different value list
filtered_dict = {k: v for k, v in test_dict.items()
                 if not any(set(v) <= set(val)
                            for val in test_dict.values() if val != v)}
 
# Printing the filtered dictionary
print("The filtered dictionary : " + str(filtered_dict))

Output

The original dictionary is : {'gfg': ['gfg', 'is', 'best'], 'best': ['gfg'], 'apple': ['good']}
The filtered dictionary : {'gfg': ['gfg', 'is', 'best'], 'apple': ['good']}

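Time complexity: O(n^2 * m), where n is the number of key-value pairs and m is the maximum length of a value list. Each of the n entries is compared against all n value lists, and each comparison builds two sets of up to m elements.
Auxiliary space: O(n) for the new dictionary of filtered key-value pairs, plus O(m) for the temporary sets built during each comparison.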
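
The comprehension above rebuilds both sets on every comparison. A minimal sketch that converts each value list to a set once and then applies the same subset test is shown here; the function name drop_subset_entries is hypothetical, and list items are assumed to be hashable.

```python
def drop_subset_entries(data):
    # Build each set once instead of on every comparison.
    as_sets = {key: set(values) for key, values in data.items()}
    return {
        key: values
        for key, values in data.items()
        # Drop the entry if its values are a subset of another, different list.
        if not any(
            as_sets[key] <= other_set
            for other_key, other_set in as_sets.items()
            if data[other_key] != values
        )
    }


test_dict = {'gfg': ['gfg', 'is', 'best'], 'best': ['gfg'], 'apple': ['good']}
print(drop_subset_entries(test_dict))
# {'gfg': ['gfg', 'is', 'best'], 'apple': ['good']}

second = {'gfg': ['gfg', 'is', 'best', 'apple'], 'best': ['gfg'], 'apple': ['good']}
# ['good'] is not a subset of any other list, so 'apple' is kept.
print(drop_subset_entries(second))
# {'gfg': ['gfg', 'is', 'best', 'apple'], 'apple': ['good']}
```

The second call shows that the subset rule compares value lists with each other and never looks at keys, so it keeps 'apple' even though the problem statement's second example drops it.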

Method 4: Using the filter() function with a lambda function

Here we use filter() with a lambda function to apply the same subset test as Method 3 to each key-value pair, and pass the result to dict().

Python3

# Initializing dictionary
test_dict = {'gfg': ['gfg', 'is', 'best'],
             'best': ['gfg'],
             'apple': ['good']}
 
# Printing original dictionary
print("The original dictionary is: " + str(test_dict))
 
filtered_dict = dict(filter(
    lambda x: not any(set(x[1]) <= set(val)
                      for val in test_dict.values() if val != x[1]),
    test_dict.items()))
 
# Printing the filtered dictionary
print("The filtered dictionary: " + str(filtered_dict))

Output

The original dictionary is: {'gfg': ['gfg', 'is', 'best'], 'best': ['gfg'], 'apple': ['good']}
The filtered dictionary: {'gfg': ['gfg', 'is', 'best'], 'apple': ['good']}

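Time Complexity: O(n^2 * m), where n is the number of key-value pairs in the dictionary and m is the maximum length of a value list, since each entry is compared against every value list and each comparison builds sets of up to m elements.

Auxiliary Space: O(n) for the filtered dictionary, where n is the number of key-value pairs, plus O(m) for the temporary sets.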
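
The lambda can be replaced by a named predicate, which leaves room for comments. This is a sketch of the same test under the same assumptions; keep_unique_entries and is_kept are hypothetical names. It also shows a consequence of the val != x[1] guard: keys whose lists are identical are never compared with each other, so all of them are kept.

```python
def keep_unique_entries(data):
    all_values = list(data.values())

    def is_kept(item):
        # item is a (key, values) pair produced by data.items()
        _, values = item
        return not any(
            set(values) <= set(other)
            for other in all_values
            if other != values  # equal lists are skipped, not compared
        )

    return dict(filter(is_kept, data.items()))


test_dict = {'gfg': ['gfg', 'is', 'best'], 'best': ['gfg'], 'apple': ['good']}
print(keep_unique_entries(test_dict))
# {'gfg': ['gfg', 'is', 'best'], 'apple': ['good']}

# Hypothetical input with identical lists: both keys survive.
print(keep_unique_entries({'a': [1], 'b': [1]}))
# {'a': [1], 'b': [1]}
```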
