Python – Sort by Uppercase Frequency

Last Updated : 08 Mar, 2023

Given a list of strings, sort the strings by the number of uppercase characters each one contains.

Input : test_list = ["Gfg", "is", "FoR", "GEEKS"]
Output : ['is', 'Gfg', 'FoR', 'GEEKS']
Explanation : 0, 1, 2, 5 uppercase letters in strings respectively.

Input : test_list = ["is", "GEEKS"]
Output : ['is', 'GEEKS']
Explanation : 0, 5 uppercase letters in strings respectively.
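
As a quick check of the two examples above, the following sketch prints each sorted list next to the uppercase count of every string. The helper name count_upper is only illustrative and does not appear in the methods that follow.

```python
def count_upper(text):
    # bool is a subclass of int, so summing the isupper() results
    # counts how many characters are uppercase
    return sum(ch.isupper() for ch in text)


for words in (["Gfg", "is", "FoR", "GEEKS"], ["is", "GEEKS"]):
    result = sorted(words, key=count_upper)
    # show the sorted strings together with the count used as the sort key
    print(result, [count_upper(word) for word in result])
```

The printed counts should be non-decreasing from left to right, which is what sorting by uppercase frequency means here.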

Method #1 : Using sort() + isupper()

In this, isupper() identifies the uppercase characters in each string, and sort() orders the list in place using that count as the key.

Python3

# Python3 code to demonstrate working of
# Sort by Uppercase Frequency
# Using isupper() + sort()
 
 
# helper function
def upper_sort(sub):
 
    # len() to get total uppercase characters
    return len([ele for ele in sub if ele.isupper()])
 
 
# initializing list
test_list = ["Gfg", "is", "BEST", "FoR", "GEEKS"]
 
# printing original list
print("The original list is: " + str(test_list))
 
# using external function to perform sorting
test_list.sort(key=upper_sort)
 
# printing result
print("Elements after uppercase sorting: " + str(test_list))

Output

The original list is: ['Gfg', 'is', 'BEST', 'FoR', 'GEEKS']
Elements after uppercase sorting: ['is', 'Gfg', 'FoR', 'BEST', 'GEEKS']

Time Complexity: O(n*m + n log n), where n is the length of the input list and m is the length of the longest string. sort() computes the key once per element, and each key computation scans the string with isupper() in O(m); the sort itself then needs O(n log n) comparisons in the worst case.
Auxiliary Space: O(n + m), as sort() stores one key per element and the helper function builds a temporary list of up to m characters.
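
The cost of counting is paid once per string, not once per comparison, because sort() evaluates the key function a single time for each element. A minimal sketch to confirm this, assuming a hypothetical calls list added purely to record the invocations:

```python
calls = []  # hypothetical log of every string passed to the key function


def upper_sort(sub):
    calls.append(sub)  # record the invocation
    # len() to get total uppercase characters
    return len([ele for ele in sub if ele.isupper()])


test_list = ["Gfg", "is", "BEST", "FoR", "GEEKS"]
test_list.sort(key=upper_sort)

# one key computation per element, regardless of how many comparisons ran
print("Key function calls:", len(calls))
print("Sorted list:", test_list)
```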

Method #2 : Using sorted() + lambda function

In this, sorted() returns a new sorted list and leaves the original untouched, and a lambda function replaces the separate helper function as the sort key.

Python3

# Python3 code to demonstrate working of
# Sort by Uppercase Frequency
# Using sorted() + lambda function
 
 
# initializing list
test_list = ["Gfg", "is", "BEST", "FoR", "GEEKS"]
 
# printing original list
print("The original list is: " + str(test_list))
 
# sorted() + lambda function used to solve problem
res = sorted(test_list, key=lambda sub: len(
    [ele for ele in sub if ele.isupper()]))
 
# printing result
print("Elements after uppercase sorting: " + str(res))

Output

The original list is: ['Gfg', 'is', 'BEST', 'FoR', 'GEEKS']
Elements after uppercase sorting: ['is', 'Gfg', 'FoR', 'BEST', 'GEEKS']
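
Python's sort is stable, so strings with the same uppercase count keep their original relative order, and this also holds when reverse=True is passed. The sketch below uses a made-up list with ties to illustrate both directions; the input is an assumption chosen for the demonstration.

```python
# two strings with 0 uppercase letters, two with 1
words = ["is", "a", "Gfg", "Abc"]

ascending = sorted(words, key=lambda sub: len(
    [ele for ele in sub if ele.isupper()]))

# reverse=True flips the key order but still keeps ties in input order
descending = sorted(words, key=lambda sub: len(
    [ele for ele in sub if ele.isupper()]), reverse=True)

print("Ascending: " + str(ascending))    # ['is', 'a', 'Gfg', 'Abc']
print("Descending: " + str(descending))  # ['Gfg', 'Abc', 'is', 'a']
```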

Method #3 : Using Counter

This approach uses the Counter class from the collections module to count the uppercase letters (A to Z) in each string, and then uses sorted() to order the list based on these counts.

Python3

# Method #3 : Using Counter
 
# Python3 code to demonstrate working of
# Sort by Uppercase Frequency
# Using Counter
 
from collections import Counter
 
# initializing list
test_list = ["Gfg", "is", "BEST", "FoR", "GEEKS"]
 
# printing original list
print("The original list is: " + str(test_list))
 
# Using Counter to get uppercase frequency count for each string
# (one Counter per string; sum the counts of the letters 'A' to 'Z')
uppercase_counts = [
    sum(count for char, count in Counter(string).items() if 'A' <= char <= 'Z')
    for string in test_list
]
 
# Using zip and sorted to sort the list based on uppercase frequency count
res = [x for _, x in sorted(zip(uppercase_counts, test_list))]
 
# printing result
print("Elements after uppercase sorting: " + str(res))

Output

The original list is: ['Gfg', 'is', 'BEST', 'FoR', 'GEEKS']
Elements after uppercase sorting: ['is', 'Gfg', 'FoR', 'BEST', 'GEEKS']

Time Complexity: O(n*m + n log n), where n is the number of strings and m is the length of the longest string. Building a Counter takes O(m) per string, and sorting the n (count, string) pairs takes O(n log n) comparisons. A comparison between two pairs with equal counts also compares the strings, which can cost up to O(m).

Auxiliary Space: O(n + m), as we store a list of n counts and build a temporary Counter for each string.
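
One behavioural difference is worth knowing: sorting (count, string) pairs produced by zip compares the strings themselves when two counts are equal, so ties come out in lexicographic order instead of input order. A small sketch with a made-up list that contains ties; the helper name upper_count is illustrative only.

```python
from collections import Counter


def upper_count(text):
    # sum the Counter entries whose key is an ASCII uppercase letter
    return sum(n for ch, n in Counter(text).items() if 'A' <= ch <= 'Z')


words = ["b", "a", "Zb", "Ya"]  # counts: 0, 0, 1, 1

# zip-based sort: ties are broken by comparing the strings
by_zip = [w for _, w in sorted(zip(map(upper_count, words), words))]

# key-based sort: ties keep their original order (stable sort)
by_key = sorted(words, key=upper_count)

print("zip-based:", by_zip)  # ['a', 'b', 'Ya', 'Zb']
print("key-based:", by_key)  # ['b', 'a', 'Zb', 'Ya']
```

For the sample list used in this article every count is distinct, so both variants give the same result.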

Method #4 : Using re module

Step-by-step algorithm:

Define a list of strings to be sorted.
Define a function uppercase_frequency() that takes a string as input and returns the frequency of uppercase characters in the string. This is done using the findall() function from the re module to find all uppercase letters in the string, and then using the len() function to count the number of matches.
Use the sorted() function to sort the list of strings based on the result of the uppercase_frequency() function. The key parameter is set to the uppercase_frequency() function to indicate that the sorting should be done based on the result of this function for each string.
Print the original list and the sorted list.

Python3

import re
 
# Define a list of strings to be sorted
test_list = ["Gfg", "is", "BEST", "FoR", "GEEKS"]
 
# Define a function to calculate the frequency of uppercase characters in a string
def uppercase_frequency(s):
    return len(re.findall(r'[A-Z]', s))
 
# Use the sorted() function to sort the list of strings by their uppercase frequency
sorted_list = sorted(test_list, key=uppercase_frequency)
 
# Print the original list and the sorted list
print("The original list is: " + str(test_list))
print("Elements after uppercase sorting: " + str(sorted_list))

Output

The original list is: ['Gfg', 'is', 'BEST', 'FoR', 'GEEKS']
Elements after uppercase sorting: ['is', 'Gfg', 'FoR', 'BEST', 'GEEKS']

Time complexity:
The uppercase_frequency() function runs in O(m), where m is the length of the input string, since findall() scans the entire string for uppercase letters. The sorted() function calls it once for each of the n strings in the list and then performs a comparison-based sort in O(n log n). Therefore, the overall time complexity of the code is O(n*m + n log n).

Auxiliary space:
The uppercase_frequency() function uses O(m) space, where m is the length of the input string, since findall() builds a list of all uppercase letters in the string. The sorted() function allocates a new list of n elements along with the computed keys, which takes O(n) space. Therefore, the overall auxiliary space complexity of the code is O(n + m).
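
Note that the pattern [A-Z] matches only ASCII uppercase letters, while str.isupper() also recognises uppercase letters outside ASCII, so the regex method can order a list differently from Methods #1 and #2 when such characters are present. A hedged sketch with an assumed input containing an accented capital; the names count_regex and count_isupper are illustrative.

```python
import re

UPPER_ASCII = re.compile(r'[A-Z]')  # matches ASCII uppercase letters only


def count_regex(text):
    return len(UPPER_ASCII.findall(text))


def count_isupper(text):
    return sum(1 for ch in text if ch.isupper())


words = ["\u00c9cole", "is", "Gfg"]  # first word is "École"

# the regex sees no uppercase letter in "École"; isupper() sees one
print([count_regex(w) for w in words])    # [0, 0, 1]
print([count_isupper(w) for w in words])  # [1, 0, 1]

by_regex = sorted(words, key=count_regex)
by_isupper = sorted(words, key=count_isupper)
print(by_regex == by_isupper)  # False
```

For purely ASCII input, as in the examples of this article, the two counts agree.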
