Open In App

Python | Find top K frequent elements from a list of tuples

Last Updated : 09 May, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

Given a list of tuples with word as first element and its frequency as second element, the task is to find top k frequent element. Below are some ways to above achieve the above task. 

Method #1: Using defaultdict 

Python3




# Python code to find top 'k' frequent element
 
# Importing
import collections
from operator import itemgetter
from itertools import chain
 
# Input list initialization
Input =[[('Name', 151)], [('ACe', 400)],
        [('TURN', 210)], [('RED', 1113)],
        [('YELLOW', 1)]]
 
# K initialization
K = 3
 
# Using defaultdict to  find top 'k' frequent element
dict_ = collections.defaultdict(list)
new_list = list(chain.from_iterable(Input))
 
for elem in new_list:
    dict_[elem[0]].append(elem[1])
 
res = {k: sum(v) for k, v in dict_.items()}
 
# Using sorted
Output = sorted(res.items(), key = itemgetter(1),
                             reverse = True)[0:K]
 
# printing output
print("Initial List of tuple is", Input)
print("\nTop 'K' elements are", Output)


The time complexity is O(n log n), where ‘n’ is the total number of elements in the input list.

The auxiliary space complexity of the given code is O(n), where ‘n’ is the total number of elements in the input list. 

  Method #2: Using itertools and sorted 

Python3




# Python code to find top 'k' frequent element
from operator import itemgetter
from itertools import chain
 
# Input list initialization
Input =[[('Name', 151)], [('ACe', 400)],
        [('TURN', 210)], [('RED', 1113)],
        [('YELLOW', 1)]]
 
# k initialization
K = 3
 
# Finding top 'k' frequent element
# without using collection
Output = sorted(list(chain.from_iterable(Input)),
        key = itemgetter(1), reverse = True)[0:K]
 
# Printing Output
print("Initial List of tuple is", Input)
print("\nTop 'K' elements are", Output)


The time complexity is O(nlogn), where ‘n’ is the total number of elements in the Input list. 

The auxiliary space is O(n), where ‘n’ is the total number of elements in the Input list. 

Method #3: Using list comprehension + lambda 

To find the top K frequent elements from the given list of tuples , you can use a list comprehension to flatten the list of tuples.

Python3




# Python code to find top 'k' frequent element
# Input list initialization
 
list_of_tuples = [[('Name', 151)], [('ACe', 400)],
                  [('TURN', 210)], [('RED', 1113)],
                  [('YELLOW', 1)]]
# k initialization
K = 3
#flattens the list of tuples using a list comprehension.
flattened_list = [elem for sublist in list_of_tuples for elem in sublist]
#sorts the flattened list by frequency in descending order using the sorted() function
sorted_list = sorted(flattened_list, key=lambda x: x[1], reverse=True)
#first K elements from the sorted list using list slicing.
top_k = sorted_list[:K]
# Printing Output
print("Initial List of tuple is", list_of_tuples)
print("\nTop 'K' elements are", top_k)
#This code is contributed by Edula Vinay Kumar Reddy


Output

Initial List of tuple is [[('Name', 151)], [('ACe', 400)], [('TURN', 210)], [('RED', 1113)], [('YELLOW', 1)]]

Top 'K' elements are [('RED', 1113), ('ACe', 400), ('TURN', 210)]

This code first flattens the list of tuples using a list comprehension. It then sorts the flattened list by frequency in descending order using the sorted() function and a lambda function as the key parameter. Finally, it takes the first K elements from the sorted list using list slicing.

This approach is similar to the previous ones, but it does not use the chain.from_iterable() function to flatten the list of tuples. Instead, it uses a list comprehension to accomplish the same task.

Time complexity: O(nlogn)

Auxiliary Space: O(n)

Algorithm:

  1. Create an empty dictionary “freq_map” to store the frequency count of each tuple element.
  2. Loop through each tuple “t” in the input list “lst”.
  3. Check if the length of the tuple is either 1 or 2.
  4. Extract the element “elem” and frequency “freq” from the tuple “t”.
  5. If “elem” is not in the “freq_map”, add it as a key and initialize the value to “freq”.
  6. If “elem” is already in the “freq_map”, increment its value by “freq”.
  7. Create an empty list “result” to store the top K frequent elements.
  8. Loop through the “freq_map” dictionary and find the top K frequent elements using a temporary list “temp” and a variable “max_freq”.
  9. Append the top K frequent elements to the “result” list.
  10. Remove the top K frequent elements from the “freq_map” dictionary.
  11. Decrement the value of K by the number of elements removed.
  12. Return the “result” list.

Python3




lst = [[('Name', 151)], [('ACe', 400)], [('TURN', 210)], [('RED', 1113)], [('YELLOW', 1)]]
freq_map = {}
for t in lst:
    if len(t) == 1:
        elem, freq = t[0], 1
    elif len(t) == 2:
        elem, freq = t[0], t[1]
    else:
        raise ValueError("Tuple should have 1 or 2 elements")
    if elem not in freq_map:
        freq_map[elem] = freq
    else:
        freq_map[elem] += freq
 
result = []
K = 3
while K > 0 and freq_map:
    temp = []
    max_freq = 0
    for elem, freq in freq_map.items():
        if freq > max_freq:
            max_freq = freq
            temp = [elem]
        elif freq == max_freq:
            temp.append(elem)
    result.extend(temp)
    for elem in temp:
        freq_map.pop(elem)
    K -= len(temp)
 
print(result)


Output

[('Name', 151), ('ACe', 400), ('TURN', 210), ('RED', 1113), ('YELLOW', 1)]

Time Complexity: O(n^2), where n is the length of the input list.

Auxiliary Space: O(n), where n is the length of the input list. 



Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads