Python | Difference of two lists including duplicates

Last Updated : 04 May, 2023

Ways to find the difference of two lists have been discussed earlier, but sometimes we need to remove only as many occurrences of each element as appear in the other list. Let’s discuss certain ways in which this can be performed.

Method #1 : Using collections.Counter()

Counter records how many times each element occurs in a list, so subtracting one Counter from another removes only as many occurrences as the second list contains, instead of discarding every copy the way a set difference does. Calling elements() on the result expands the remaining counts back into a list.

Python3

# Python3 code to demonstrate
# Difference of list including duplicates
# Using collections.Counter()
from collections import Counter
 
# initializing lists
test_list1 = [1, 3, 4, 5, 1, 3, 3]
test_list2 = [1, 3, 5]
 
# printing original lists
print("The original list 1 : " + str(test_list1))
print("The original list 2 : " + str(test_list2))
 
# Using collections.Counter()
# Difference of list including duplicates
res = list((Counter(test_list1) - Counter(test_list2)).elements())
 
# print result
print("The list after performing the subtraction : " + str(res))

Output :

The original list 1 : [1, 3, 4, 5, 1, 3, 3]
The original list 2 : [1, 3, 5]
The list after performing the subtraction : [1, 3, 3, 4]

Time complexity: O(n)
Auxiliary space: O(n), where n is the total number of elements in both lists.
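The Counter approach groups equal elements together in the result. If the original order of the first list matters, the same counting idea can be applied while walking the list once. The following is a minimal sketch under that assumption; list_difference is a hypothetical helper name, not something from the standard library.

```python
from collections import Counter


def list_difference(first, second):
    """Return first minus second, cancelling one occurrence per match.

    Hypothetical helper for illustration; keeps the order of `first`.
    """
    # How many occurrences of each value still need to be cancelled.
    remaining = Counter(second)
    result = []
    for item in first:
        if remaining[item] > 0:
            # This occurrence is cancelled by one in `second`.
            remaining[item] -= 1
        else:
            result.append(item)
    return result


print(list_difference([1, 3, 4, 5, 1, 3, 3], [1, 3, 5]))  # [4, 1, 3, 3]
```

This still runs in time linear in the combined length of both lists, and it leaves both inputs unchanged.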

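Method #2 : Using map() + lambda + remove()

map() applies a lambda to every element of the second list, and the lambda calls remove() on the first list, which deletes only the first matching occurrence. Each element of the second list therefore cancels at most one element of the first. As written, this works with Python 2 only: there map() runs eagerly, whereas in Python 3 map() returns a lazy iterator and the removals do not happen unless it is consumed. Note that test_list1 is modified in place.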

Python

# Python code to demonstrate
# Difference of list including duplicates
# Using map() + lambda + remove()
 
# initializing lists
test_list1 = [1, 3, 4, 5, 1, 3, 3]
test_list2 = [1, 3, 5]
 
# printing original lists
print("The original list 1 : " + str(test_list1))
print("The original list 2 : " + str(test_list2))
 
# Using map() + lambda + remove()
# Difference of list including duplicates
res = map(lambda x: test_list1.remove(x) 
          if x in test_list1 else None, test_list2)
 
# print result
print("The list after performing the subtraction : " + str(test_list1))

Output :

The original list 1 : [1, 3, 4, 5, 1, 3, 3]
The original list 2 : [1, 3, 5]
The list after performing the subtraction : [4, 1, 3, 3]

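Time Complexity: O(n*m), where n is the length of test_list1 and m is the length of test_list2, since each membership test and each remove() call scans test_list1.
Auxiliary Space: O(m) in Python 2, for the list of None values that map() returns; test_list1 itself is modified in place.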
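Because Method #2 relies on Python 2's eager map(), here is a hedged sketch of how the same idea might be made to run on Python 3. It assumes that consuming the map object with list() is acceptable, and it works on a copy (the name remaining is introduced here for illustration) so that test_list1 is left untouched.

```python
test_list1 = [1, 3, 4, 5, 1, 3, 3]
test_list2 = [1, 3, 5]

# Work on a copy so the original list is not modified.
remaining = test_list1.copy()

# In Python 3, map() is lazy; list() forces every lambda call to run.
list(map(lambda x: remaining.remove(x) if x in remaining else None,
         test_list2))

print("The list after performing the subtraction : " + str(remaining))
# The list after performing the subtraction : [4, 1, 3, 3]
```

Using map() purely for its side effects is generally considered unidiomatic; a plain for loop expresses the same thing more directly.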

Method #3 : Using loop

Step-by-Step Algorithm:

Initialize two lists, test_list1 and test_list2
Create a copy of test_list1 and assign it to the variable res.
Iterate through each element in test_list2.
If the element exists in res, remove its first occurrence from res using the remove() method.
Sort the res list in ascending order.
Print the res list as the result.

Python3

# initializing lists
test_list1 = [1, 3, 4, 5, 1, 3, 3]
test_list2 = [1, 3, 5]
 
# compute the difference between test_list1 and test_list2
res = test_list1.copy()
for elem in test_list2:
    if elem in res:
        res.remove(elem)
res.sort()
# print the result
print("The list after performing the subtraction : " + str(res))

Output

The list after performing the subtraction : [1, 3, 3, 4]

Time Complexity:
The time complexity of this algorithm is O(n * m), where n is the length of test_list1 and m is the length of test_list2. This is because we need to iterate through each element in test_list2 and perform an operation (remove) that could take up to O(n) time in the worst case.

Auxiliary Space Complexity:
The auxiliary space complexity of this algorithm is O(n), where n is the length of test_list1, because res is a copy of test_list1. The remove() calls modify res in place and need no additional space.
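The loop above scans res twice per element, once for the membership test and once for remove(). One possible variation, sketched here as an assumption rather than as the article's method, lets remove() do the search and catches the ValueError it raises when the element is absent. The function name subtract_occurrences is hypothetical, and the final sort is omitted so the original order is kept.

```python
def subtract_occurrences(first, second):
    """Remove one occurrence from `first` for each element of `second`.

    Hypothetical helper for illustration; returns a new list.
    """
    result = list(first)  # copy, so the caller's list is unchanged
    for elem in second:
        try:
            # remove() deletes only the first matching occurrence.
            result.remove(elem)
        except ValueError:
            # elem is not present (or no longer present); nothing to cancel.
            pass
    return result


print(subtract_occurrences([1, 3, 4, 5, 1, 3, 3], [1, 3, 5]))  # [4, 1, 3, 3]
```

The worst-case time complexity is still O(n * m); the change only avoids the duplicate scan.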

Method #4 : Using heapq

Algorithm:

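Convert both lists into min-heaps in place using heapq.heapify().
While the first heap is not empty, compare the smallest elements of the two heaps.
If the second heap is empty or the first heap's smallest element is smaller, pop it from the first heap and append it to the result.
If the two smallest elements are equal, pop both, which cancels one occurrence.
Otherwise, pop the smallest element of the second heap, since it has no match in the first.
Print the result, which comes out in ascending order.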

Python3

import heapq
 
# initializing lists
test_list1 = [1, 3, 4, 5, 1, 3, 3]
test_list2 = [1, 3, 5]
 
# Using heapq method to find the difference
# between two lists
heapq.heapify(test_list1)
heapq.heapify(test_list2)
 
res = []
 
while test_list1:
   
    if not test_list2 or test_list1[0] < test_list2[0]:
        res.append(heapq.heappop(test_list1))
    elif test_list1[0] == test_list2[0]:
        heapq.heappop(test_list1)
        heapq.heappop(test_list2)
    else:
        heapq.heappop(test_list2)
 
# print the result
print("The list after performing the subtraction : " + str(res))

Output

The list after performing the subtraction : [1, 3, 3, 4]

Time Complexity:

Converting the two lists to heaps using heapify takes O(n + m) time, where n is the length of test_list1 and m is the length of test_list2.
The while loop pops at least one element per iteration, so it runs at most n + m times.
Each heappop takes O(log n) or O(log m) time, depending on which heap it pops from.
Overall, the time complexity of the algorithm is O(n log n + m log m).

Space Complexity:

The space complexity of the algorithm is O(n), where n is the length of test_list1, because the new list res stores the elements of the difference. heapify and heappop work in place and need only O(1) extra space, but this also means that both input lists are reordered and consumed: test_list1 is empty when the loop ends.
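Since the heap-based code consumes its inputs, a caller who still needs the original lists may prefer to wrap it in a function that heapifies copies. The following is a minimal sketch under that assumption; heap_difference is a hypothetical name, and the merge logic is the same as in the code above.

```python
import heapq


def heap_difference(first, second):
    """Sorted difference of two lists, cancelling one occurrence per match.

    Hypothetical helper for illustration; the inputs are not modified.
    """
    heap1, heap2 = list(first), list(second)  # copies to heapify
    heapq.heapify(heap1)
    heapq.heapify(heap2)

    result = []
    while heap1:
        if not heap2 or heap1[0] < heap2[0]:
            # Smallest remaining element has no match in the second heap.
            result.append(heapq.heappop(heap1))
        elif heap1[0] == heap2[0]:
            # Matching pair: cancel one occurrence from each heap.
            heapq.heappop(heap1)
            heapq.heappop(heap2)
        else:
            # Second heap's smallest element cannot match anything left.
            heapq.heappop(heap2)
    return result


original = [1, 3, 4, 5, 1, 3, 3]
print(heap_difference(original, [1, 3, 5]))  # [1, 3, 3, 4]
print(original)                              # [1, 3, 4, 5, 1, 3, 3]
```

The copies raise the auxiliary space to O(n + m), which is the price of leaving the inputs intact.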
