Python program to Sort a List of Strings by the Number of Unique Characters
Last Updated :
22 Apr, 2023
Given a list of strings. The task is to sort the list of strings by the number of unique characters.
Examples:
Input : test_list = [‘gfg’, ‘best’, ‘for’, ‘geeks’],
Output : [‘gfg’, ‘for’, ‘best’, ‘geeks’]
Explanation : 2, 3, 4, 4 are unique elements in lists.
Input : test_list = [‘gfg’, ‘for’, ‘geeks’],
Output : [‘gfg’, ‘for’, ‘geeks’]
Explanation : 2, 3, 4 are unique elements in lists.
Method #1 : Using sort() + len() + set()
In this, we perform task of sorting using sort(), and len and sort functions are used to get length of unique characters in string.
Python3
def hlper_fnc(ele):
return len ( list ( set (ele)))
test_list = [ 'gfg' , 'best' , 'for' , 'geeks' ]
print ( "The original list is : " + str (test_list))
test_list.sort(key = hlper_fnc)
print ( "Sorted List : " + str (test_list))
|
Output
The original list is : ['gfg', 'best', 'for', 'geeks']
Sorted List : ['gfg', 'for', 'best', 'geeks']
Time Complexity: O(nlogn)
Space Complexity: O(n)
Method #2 : Using sorted() + len() + set() + lambda
Similar to above method, difference being not inplace sort, and also uses lambda function for performing task.
Python3
test_list = [ 'gfg' , 'best' , 'for' , 'geeks' ]
print ( "The original list is : " + str (test_list))
res = sorted (test_list, key = lambda sub : len ( list ( set (sub))))
print ( "Sorted List : " + str (res))
|
Output
The original list is : ['gfg', 'best', 'for', 'geeks']
Sorted List : ['gfg', 'for', 'best', 'geeks']
Time Complexity: O(nlogn)
Space Complexity: O(n)
Python3
def hlper_fnc(ele):
return len ( list ( set (ele)))
test_list = [ 'gfg' , 'best' , 'for' , 'geeks' ]
print ( "The original list is : " + str (test_list))
test_list = sorted (test_list, key = hlper_fnc)
print ( "Sorted List : " + str (test_list))
|
Output
The original list is : ['gfg', 'best', 'for', 'geeks']
Sorted List : ['gfg', 'for', 'best', 'geeks']
Method 4:Using Counter() function
Python3
from collections import Counter
def hlper_fnc(ele):
freq = Counter(ele)
return len (freq)
test_list = [ 'gfg' , 'best' , 'for' , 'geeks' ]
print ( "The original list is : " + str (test_list))
test_list.sort(key = hlper_fnc)
print ( "Sorted List : " + str (test_list))
|
Output
The original list is : ['gfg', 'best', 'for', 'geeks']
Sorted List : ['gfg', 'for', 'best', 'geeks']
Time Complexity: O(nlogn)
Auxiliary Space: O(n)
Method 5: Using NumPy
Here’s an approach using NumPy, which creates a NumPy array from the input list and then sorts the array based on the number of unique characters in each string.
Python3
import numpy as np
def sort_list_unique_chars(lst):
arr = np.array(lst)
unique_char_count = np.array([ len (np.unique(i)) for i in arr])
sorted_indices = np.argsort(unique_char_count)
sorted_arr = arr[sorted_indices]
sorted_list = sorted_arr.tolist()
return sorted_list
test_list = [ 'gfg' , 'best' , 'for' , 'geeks' ]
print (sort_list_unique_chars(test_list))
|
Output:
[‘gfg’, ‘for’, ‘best’, ‘geeks’]
Time Complexity: O(nlogn)
Auxiliary Space: O(n)
Method 6: Using QuickSort and count of unique characters
- Define a helper function called count_unique_chars that takes a string and returns the count of unique characters in the string using the set and len functions.
- Define a partition function called partition that takes a list, a low index, and a high index, and returns the index of the pivot element after partitioning the list around the pivot element. The pivot element is the first element in the list (arr[low]). We set two pointers i and j initially pointing to the low and high indices, respectively. We then iterate through the list using the two pointers, swapping elements if necessary, until the pointers cross each other. Finally, we swap the pivot element with the element at index j and return j.
- Define a quicksort function called quicksort that takes a list, a low index, and a high index. If the low index is less than the high index, we partition the list around a pivot index using the partition function and recursively call quicksort on the two resulting sublists to sort them.
- Initialize a test list of strings called test_list.
- Call quicksort on test_list, passing in 0 as the low index and len(test_list)-1 as the high index, to sort the list based on count of unique characters.
- Print the sorted list using the print function.
Python3
def count_unique_chars(s):
return len ( set (s))
def partition(arr, low, high):
pivot = arr[low]
i = low + 1
j = high
while True :
while i < = j and count_unique_chars(arr[i]) < = count_unique_chars(pivot):
i + = 1
while i < = j and count_unique_chars(arr[j]) > = count_unique_chars(pivot):
j - = 1
if i < = j:
arr[i], arr[j] = arr[j], arr[i]
else :
break
arr[low], arr[j] = arr[j], arr[low]
return j
def quicksort(arr, low, high):
if low < high:
pivot_index = partition(arr, low, high)
quicksort(arr, low, pivot_index - 1 )
quicksort(arr, pivot_index + 1 , high)
test_list = [ 'gfg' , 'best' , 'for' , 'geeks' ]
quicksort(test_list, 0 , len (test_list) - 1 )
print (test_list)
|
Output
['gfg', 'for', 'geeks', 'best']
The time complexity of the partition function is O(n), and the quicksort function calls the partition function recursively on each sub-list, so the time complexity of quicksort is O(n log n) in the average case and O(n^2) in the worst case.
Auxiliary Space:
The partition function uses a constant amount of auxiliary space, and quicksort uses O(log n) auxiliary space in the average case and O(n) in the worst case due to the recursion depth. Therefore, the overall space complexity of this method is O(log n) in the average case and O(n) in the worst case.
Like Article
Suggest improvement
Share your thoughts in the comments
Please Login to comment...