Sometimes, while working with records, we may need to extract the records that occur more than once. This kind of problem can come up in the web development domain. Let's discuss certain ways in which this task can be performed.
Method #1 : Using list comprehension + set() + count() An initial approach is to iterate over each tuple and check its count in the list using count(); if the count is greater than one, we add the tuple to the result. To avoid adding the same tuple multiple times, we convert the result to a set using set().
Python3
test_list = [(3, 4), (4, 5), (3, 4),
             (3, 4), (4, 5), (6, 7)]

print("The original list : " + str(test_list))

# keep every tuple that appears more than once; set() drops repeats
res = list(set([ele for ele in test_list
                if test_list.count(ele) > 1]))

print("All the duplicates from list are : " + str(res))
Output
The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are : [(4, 5), (3, 4)]
Time Complexity: O(n^2)
Auxiliary Space: O(n)
Method #2 : Using Counter() + items() + list comprehension The combination of the above functions can also perform this task. Here, we get the count of each element as a dictionary using Counter() and then extract all elements whose count is above 1.
Python3
from collections import Counter

test_list = [(3, 4), (4, 5), (3, 4),
             (3, 4), (4, 5), (6, 7)]

print("The original list : " + str(test_list))

# keep the tuples whose count in the Counter mapping is above 1
res = [ele for ele, count in Counter(test_list).items()
       if count > 1]

print("All the duplicates from list are : " + str(res))
Output
The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are : [(4, 5), (3, 4)]
Time Complexity: O(n)
Auxiliary Space: O(n)
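To make the mechanics of Method #2 concrete, here is a small sketch (not part of the original listing) that prints the intermediate Counter before the filtering step:

```python
from collections import Counter

test_list = [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]

# The intermediate Counter maps each tuple to its frequency;
# items() then yields (tuple, count) pairs for the comprehension to filter.
counts = Counter(test_list)
print(counts)   # (3, 4) occurs three times, (4, 5) twice, (6, 7) once
```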
Method #3 : Using a dictionary We can iterate through the list of tuples and, for each tuple, check whether it already exists in the dictionary. If it does, the tuple is a duplicate, and we add it to a separate list of duplicate tuples. If it doesn't, we add it to the dictionary as a key with a value of 1.
Python3
test_list = [(3, 4), (4, 5), (3, 4), (4, 5), (6, 7)]

print('The original list :' + str(test_list))

d = {}
duplicates = []

# a tuple already present in d has been seen before, so it is a duplicate
for tup in test_list:
    if tup in d:
        duplicates.append(tup)
    else:
        d[tup] = 1

print('All the duplicates from list are :' + str(duplicates))
Output
The original list :[(3, 4), (4, 5), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are :[(3, 4), (4, 5)]
Time Complexity: O(n), where n is the number of elements in the list
Auxiliary Space: O(n), as it creates a dictionary with up to n entries.
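As an aside, since the dictionary values in Method #3 are never read, the same one-pass idea can be written with a plain set. A minimal sketch of that variant (not one of the original listings):

```python
test_list = [(3, 4), (4, 5), (3, 4),
             (3, 4), (4, 5), (6, 7)]

seen = set()
duplicates = set()

# a tuple already in `seen` has occurred before, so it is a duplicate
for tup in test_list:
    if tup in seen:
        duplicates.add(tup)
    else:
        seen.add(tup)

print("All the duplicates from list are : " + str(sorted(duplicates)))
```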
Method #4 : Using nested for loops
Python3
test_list = [(3, 4), (4, 5), (3, 4),
             (3, 4), (4, 5), (6, 7)]

print("The original list : " + str(test_list))

res = []

# compare every pair of positions; record a tuple once when a match is found
for i in range(len(test_list)):
    for j in range(i + 1, len(test_list)):
        if test_list[i] == test_list[j] and test_list[i] not in res:
            res.append(test_list[i])

print("All the duplicates from list are : " + str(res))
Output
The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are : [(3, 4), (4, 5)]
Time Complexity: O(n^2), due to the nested loops
Auxiliary Space: O(n)
Method #5 : Using the groupby() function from itertools
This approach uses groupby() to group equal tuples in the list together. The list is first sorted so that duplicates sit next to each other, and then consecutive equal tuples are grouped.
The code then looks for groups with more than one element (i.e., duplicates) and collects their keys in a new list. Finally, it prints the list of duplicate tuples.
In essence, the code sorts the input list of tuples, groups equal tuples, filters the groups to keep only the duplicates, and outputs them in a new list.
Python3
from itertools import groupby

test_list = [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]

print("The original list : " + str(test_list))

# sort so that equal tuples are adjacent, then group consecutive equals
test_list.sort()
grouped = groupby(test_list)

duplicates = [key for key, group in grouped if len(list(group)) > 1]

print(duplicates)
Output
The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
[(3, 4), (4, 5)]
Time Complexity: O(n log n)
Auxiliary Space: O(n)
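A small cautionary sketch (not part of the original listing) showing why the sort step in Method #5 is required: groupby() only merges *consecutive* equal elements, so on the unsorted list only the adjacent pair of duplicates is detected:

```python
from itertools import groupby

# groupby() groups only consecutive equal elements, so without sorting,
# the scattered duplicates are never merged into a single group.
unsorted_list = [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]

without_sort = [key for key, group in groupby(unsorted_list)
                if len(list(group)) > 1]
print(without_sort)   # only the adjacent (3, 4), (3, 4) run is found
```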
Method #6 : Using defaultdict
Algorithm:
- Initialize an empty defaultdict freq_dict mapping each tuple to the list of its occurrences.
- Iterate over each tuple tpl in test_list.
- Append tpl to the list stored under the key tpl in freq_dict.
- Collect the lists whose length is greater than 1 (tuples with frequency above 1).
- Flatten those lists to get the duplicate tuples.
- Remove repeats from the list of duplicate tuples and sort it.
- Return the list of duplicate tuples.
Python3
from collections import defaultdict

test_list = [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]

print("The original list : " + str(test_list))

# collect every occurrence of a tuple under that tuple as the key
freq_dict = defaultdict(list)
for tpl in test_list:
    freq_dict[tpl].append(tpl)

# keep the lists with more than one occurrence, flatten, deduplicate, sort
res = [tpls for tpls in freq_dict.values() if len(tpls) > 1]
res = [tpl for tpls in res for tpl in tpls]
res = list(set(res))
res.sort()

print("All the duplicates from list are : " + str(res))
Output
The original list : [(3, 4), (4, 5), (3, 4), (3, 4), (4, 5), (6, 7)]
All the duplicates from list are : [(3, 4), (4, 5)]
Time Complexity:
The time complexity of this approach is O(n), where n is the length of the input list. This is because we need to iterate over each tuple in the list once to create the frequency dictionary, and then we need to iterate over the list of tuples with a frequency greater than 1 to get the duplicate tuples.
Auxiliary Space:
The auxiliary space complexity of this approach is O(n), where n is the length of the input list. This is because we need to create a dictionary with a key for each unique tuple in the input list, which could be as large as the length of the input list. However, the actual space used by the dictionary will depend on the number of unique tuples in the input list, so the space complexity could be less than O(n) in practice.