Python | Find top K frequent elements from a list of tuples
Last Updated :
09 May, 2023
Given a list of tuples with word as first element and its frequency as second element, the task is to find top k frequent element. Below are some ways to above achieve the above task.
Method #1: Using defaultdict
Python3
import collections
from operator import itemgetter
from itertools import chain
Input = [[( 'Name' , 151 )], [( 'ACe' , 400 )],
[( 'TURN' , 210 )], [( 'RED' , 1113 )],
[( 'YELLOW' , 1 )]]
K = 3
dict_ = collections.defaultdict( list )
new_list = list (chain.from_iterable( Input ))
for elem in new_list:
dict_[elem[ 0 ]].append(elem[ 1 ])
res = {k: sum (v) for k, v in dict_.items()}
Output = sorted (res.items(), key = itemgetter( 1 ),
reverse = True )[ 0 :K]
print ("Initial List of tuple is ", Input )
print ("\nTop 'K' elements are", Output)
|
The time complexity is O(n log n), where ‘n’ is the total number of elements in the input list.
The auxiliary space complexity of the given code is O(n), where ‘n’ is the total number of elements in the input list.
Method #2: Using itertools and sorted
Python3
from operator import itemgetter
from itertools import chain
Input = [[( 'Name' , 151 )], [( 'ACe' , 400 )],
[( 'TURN' , 210 )], [( 'RED' , 1113 )],
[( 'YELLOW' , 1 )]]
K = 3
Output = sorted ( list (chain.from_iterable( Input )),
key = itemgetter( 1 ), reverse = True )[ 0 :K]
print ("Initial List of tuple is ", Input )
print ("\nTop 'K' elements are", Output)
|
The time complexity is O(nlogn), where ‘n’ is the total number of elements in the Input list.
The auxiliary space is O(n), where ‘n’ is the total number of elements in the Input list.
Method #3: Using list comprehension + lambda
To find the top K frequent elements from the given list of tuples , you can use a list comprehension to flatten the list of tuples.
Python3
list_of_tuples = [[( 'Name' , 151 )], [( 'ACe' , 400 )],
[( 'TURN' , 210 )], [( 'RED' , 1113 )],
[( 'YELLOW' , 1 )]]
K = 3
flattened_list = [elem for sublist in list_of_tuples for elem in sublist]
sorted_list = sorted (flattened_list, key = lambda x: x[ 1 ], reverse = True )
top_k = sorted_list[:K]
print ( "Initial List of tuple is" , list_of_tuples)
print ( "\nTop 'K' elements are" , top_k)
|
Output
Initial List of tuple is [[('Name', 151)], [('ACe', 400)], [('TURN', 210)], [('RED', 1113)], [('YELLOW', 1)]]
Top 'K' elements are [('RED', 1113), ('ACe', 400), ('TURN', 210)]
This code first flattens the list of tuples using a list comprehension. It then sorts the flattened list by frequency in descending order using the sorted() function and a lambda function as the key parameter. Finally, it takes the first K elements from the sorted list using list slicing.
This approach is similar to the previous ones, but it does not use the chain.from_iterable() function to flatten the list of tuples. Instead, it uses a list comprehension to accomplish the same task.
Time complexity: O(nlogn)
Auxiliary Space: O(n)
Algorithm:
- Create an empty dictionary “freq_map” to store the frequency count of each tuple element.
- Loop through each tuple “t” in the input list “lst”.
- Check if the length of the tuple is either 1 or 2.
- Extract the element “elem” and frequency “freq” from the tuple “t”.
- If “elem” is not in the “freq_map”, add it as a key and initialize the value to “freq”.
- If “elem” is already in the “freq_map”, increment its value by “freq”.
- Create an empty list “result” to store the top K frequent elements.
- Loop through the “freq_map” dictionary and find the top K frequent elements using a temporary list “temp” and a variable “max_freq”.
- Append the top K frequent elements to the “result” list.
- Remove the top K frequent elements from the “freq_map” dictionary.
- Decrement the value of K by the number of elements removed.
- Return the “result” list.
Python3
lst = [[( 'Name' , 151 )], [( 'ACe' , 400 )], [( 'TURN' , 210 )], [( 'RED' , 1113 )], [( 'YELLOW' , 1 )]]
freq_map = {}
for t in lst:
if len (t) = = 1 :
elem, freq = t[ 0 ], 1
elif len (t) = = 2 :
elem, freq = t[ 0 ], t[ 1 ]
else :
raise ValueError( "Tuple should have 1 or 2 elements" )
if elem not in freq_map:
freq_map[elem] = freq
else :
freq_map[elem] + = freq
result = []
K = 3
while K > 0 and freq_map:
temp = []
max_freq = 0
for elem, freq in freq_map.items():
if freq > max_freq:
max_freq = freq
temp = [elem]
elif freq = = max_freq:
temp.append(elem)
result.extend(temp)
for elem in temp:
freq_map.pop(elem)
K - = len (temp)
print (result)
|
Output
[('Name', 151), ('ACe', 400), ('TURN', 210), ('RED', 1113), ('YELLOW', 1)]
Time Complexity: O(n^2), where n is the length of the input list.
Auxiliary Space: O(n), where n is the length of the input list.
Share your thoughts in the comments
Please Login to comment...