Python | Sort given list by frequency and remove duplicates
Last Updated :
24 Mar, 2023
Sorting a list and removing its duplicates are both common tasks in everyday development. Sorting by frequency has been discussed before, but sometimes we also want to drop the duplicates, and to do so concisely without many extra lines of code. Let's discuss a few ways in which this can be done.
Method #1 : Using count() + set() + sorted()
The set function removes the duplicates, the count function computes the frequency of each element, and the sorted function orders the unique elements by that frequency, from least to most frequent. Elements with equal frequency appear in the set's iteration order, which is not guaranteed.
Python3
# Initialize the list
test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
print("The original list : " + str(test_list))
# Remove duplicates with set(), then sort by ascending frequency
res = sorted(set(test_list), key=lambda ele: test_list.count(ele))
print("The list after sorting and removal : " + str(res))
Output
The original list : [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
The list after sorting and removal : [2, 3, 6, 5]
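Time Complexity: O(n·k + k log k), where n is the number of elements in the list "test_list" and k is the number of distinct elements, because count() scans the whole list once for each distinct element. In the worst case (all elements distinct) this is O(n²).
Auxiliary Space: O(k) for the set and the result list, which is O(n) in the worst case.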
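Calling count() inside the sort key rescans the list once per distinct element. The following is a minimal sketch of one way to avoid that, assuming a single counting pass with collections.Counter is acceptable. The helper name unique_by_frequency is hypothetical and not part of the original method. Ties are broken by value here, so the result does not depend on set iteration order.

```python
from collections import Counter

def unique_by_frequency(items):
    """Return the distinct items ordered by ascending frequency (ties by value)."""
    counts = Counter(items)  # single pass over the input
    # Sort the distinct keys by (frequency, value) for a deterministic result
    return sorted(counts, key=lambda ele: (counts[ele], ele))

test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
print(unique_by_frequency(test_list))  # [2, 3, 6, 5]
```

The tie-break by value assumes the elements are mutually comparable. For mixed types, the key could use only counts[ele] instead.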
Method #2 : Using Counter.most_common() + list comprehension
If the result should be ordered by decreasing frequency, one can use the most_common() method of collections.Counter, which returns (element, count) pairs already sorted from most to least frequent. A list comprehension then keeps only the elements.
Python3
from collections import Counter
# Initialize the list
test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
print("The original list : " + str(test_list))
# most_common() yields (element, count) pairs, most frequent first
res = [key for key, value in Counter(test_list).most_common()]
print("The list after sorting and removal : " + str(res))
Output
The original list : [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
The list after sorting and removal : [5, 6, 3, 2]
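The sketch below illustrates how most_common() treats ties, and one way to get ascending order from it. Elements with equal counts are returned in the order they were first encountered (documented behavior since Python 3.7). The sample list data is illustrative only.

```python
from collections import Counter

# "b" and "a" both occur twice; "c" and "d" both occur once
data = ["b", "a", "b", "c", "a", "d"]

pairs = Counter(data).most_common()
descending = [key for key, _ in pairs]
ascending = descending[::-1]  # reversing also reverses the order within ties

print(pairs)       # [('b', 2), ('a', 2), ('c', 1), ('d', 1)]
print(descending)  # ['b', 'a', 'c', 'd']
print(ascending)   # ['d', 'c', 'a', 'b']
```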
Method #3 : Using itertools
To sort a list by frequency and remove duplicates using the itertools library in Python, you can do the following:
Python3
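from itertools import groupby
# Initialize the list
test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
print("The original list : " + str(test_list))
# Sort first so equal elements are adjacent, then build (count, element) tuples
groups = [(len(list(group)), key) for key, group in groupby(sorted(test_list))]
# Order the tuples from highest to lowest frequency
groups.sort(reverse=True)
res = [key for count, key in groups]
# Reverse so the output runs from lowest to highest frequency
print("The list after sorting and removal : " + str(res[::-1]))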
Output
The original list : [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
The list after sorting and removal : [2, 3, 6, 5]
Explanation:
- First, we import the groupby function from the itertools library.
- Then, we initialize the list and sort it so that equal elements become adjacent. The groupby function only groups consecutive equal elements, so passing it the sorted list ensures that each distinct value ends up in exactly one group.
- Next, we create a list of tuples where each tuple consists of the frequency of an element and the element itself. We use the len function to get the frequency of each element and the key variable to get the element itself.
- We sort the list of tuples by frequency in descending order using the sort function with the reverse parameter set to True.
- Finally, we use a list comprehension to extract the element from each tuple in the sorted list, which gives the elements from highest to lowest frequency. The print statement reverses this list (res[::-1]), so the displayed result runs from lowest to highest frequency.
The time complexity of this approach is O(n log n), as the groupby function has a time complexity of O(n) and the sort function has a time complexity of O(n log n).
The auxiliary space of this approach is O(n), as we create a new list with the same number of elements as the original list.
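To see why the list is sorted before grouping, the short sketch below compares groupby on the raw list with groupby on the sorted list. The variable names unsorted_groups and sorted_groups are illustrative.

```python
from itertools import groupby

test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]

# Without sorting, equal values that are not adjacent land in separate groups
unsorted_groups = [(key, len(list(group))) for key, group in groupby(test_list)]
print(unsorted_groups)
# [(5, 1), (6, 1), (2, 1), (5, 1), (3, 2), (6, 1), (5, 2), (6, 1), (5, 1)]

# After sorting, each distinct value forms exactly one group
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(test_list))]
print(sorted_groups)
# [(2, 1), (3, 2), (5, 5), (6, 3)]
```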
Method #4 : Using operator.countOf() + set() + sorted()
This approach mirrors Method #1: the set function removes the duplicates and the sorted function orders the unique elements by frequency, but the frequency is computed with operator.countOf(test_list, ele) instead of the list's own count method.
Python3
import operator as op
# Initialize the list
test_list = [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
print("The original list : " + str(test_list))
# Remove duplicates with set(), then sort by ascending frequency via countOf()
res = sorted(set(test_list), key=lambda ele: op.countOf(test_list, ele))
print("The list after sorting and removal : " + str(res))
Output
The original list : [5, 6, 2, 5, 3, 3, 6, 5, 5, 6, 5]
The list after sorting and removal : [2, 3, 6, 5]
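Time Complexity: O(n·k + k log k), where n is the length of the list and k is the number of distinct elements, because countOf() scans the whole list once for each distinct element. In the worst case this is O(n²).
Auxiliary Space: O(k) for the set and the result list, which is O(n) in the worst case.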
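One practical difference from Method #1 is that operator.countOf() accepts any iterable, including ones that have no count() method. The sketch below shows this with a dictionary's values view. The scores dictionary is hypothetical sample data, and the tie-break by value is an added assumption to keep the result deterministic.

```python
from operator import countOf

# Hypothetical sample data; a dict values view has no .count() method
scores = {"ann": 5, "bob": 6, "cy": 5, "dee": 2, "eve": 6, "fay": 5}
values = scores.values()

# Distinct values ordered by ascending frequency, ties broken by value
res = sorted(set(values), key=lambda ele: (countOf(values, ele), ele))
print(res)  # [2, 6, 5]
```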