Python – Filter above Threshold size Strings
When working with large amounts of data, we sometimes need to extract only the strings whose length meets a minimum threshold. This kind of problem can come up in validation tasks across many domains. Let’s discuss some ways to do this for a list of strings in Python.
Method #1: Using list comprehension + len()
This task can be performed by combining a list comprehension with len(). We iterate over all the strings and keep only those whose length, as measured by len(), is at least the threshold.
Python3
# Initialize the list
test_list = ['gfg', 'is', 'best', 'for', 'geeks']
print("The original list : " + str(test_list))
# Minimum length a string must have to be kept
thres = 4
# Keep only the strings whose length is at least thres
res = [ele for ele in test_list if len(ele) >= thres]
print("The above Threshold size strings are : " + str(res))
Output :
The original list : ['gfg', 'is', 'best', 'for', 'geeks']
The above Threshold size strings are : ['best', 'geeks']
Time Complexity: O(n)
Auxiliary Space: O(n)
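The comprehension can also be wrapped in a small reusable function. The sketch below is one way to do that; the name filter_by_min_length and its parameters are illustrative and not part of the original example. Note that the comparison is inclusive, so a string whose length equals the threshold is kept.

```python
def filter_by_min_length(strings, min_len):
    """Return the strings whose length is at least min_len (hypothetical helper)."""
    return [s for s in strings if len(s) >= min_len]


test_list = ['gfg', 'is', 'best', 'for', 'geeks']
res = filter_by_min_length(test_list, 4)
print(res)  # ['best', 'geeks']
```

Because the function builds a new list, the original test_list is left unchanged.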
Method #2: Using filter() + lambda
This task can also be performed by combining filter() with a lambda. filter() extracts the elements, and the length check is expressed in the lambda function.
Python3
# Initialize the list
test_list = ['gfg', 'is', 'best', 'for', 'geeks']
print("The original list : " + str(test_list))
# Minimum length a string must have to be kept
thres = 4
# filter() keeps the elements for which the lambda returns True
res = list(filter(lambda ele: len(ele) >= thres, test_list))
print("The above Threshold size strings are : " + str(res))
Output :
The original list : ['gfg', 'is', 'best', 'for', 'geeks']
The above Threshold size strings are : ['best', 'geeks']
Time Complexity: O(n)
Auxiliary Space: O(n)
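In Python 3, filter() returns a lazy iterator rather than a list, which is why the example above wraps it in list(). When the filtered strings are only needed once, the iterator can be consumed directly. A minimal sketch, where the names long_strings and collected are illustrative:

```python
test_list = ['gfg', 'is', 'best', 'for', 'geeks']
thres = 4

# filter() returns a lazy iterator; no element has been checked yet
long_strings = filter(lambda ele: len(ele) >= thres, test_list)

# Items are produced one at a time as the loop asks for them
collected = []
for ele in long_strings:
    collected.append(ele.upper())

print(collected)  # ['BEST', 'GEEKS']
```

Once the loop finishes, the iterator is exhausted and cannot be reused; call filter() again if the results are needed a second time.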
Method #3: Using copy() + remove() + join() methods
In this method, we iterate over a copy of the list and call remove() on the original list for every string shorter than the threshold, so the list is filtered in place. The remaining strings are printed as a single space-separated string using join().
Python3
# Initialize the list
test_list = ['gfg', 'is', 'best', 'for', 'geeks']
print("The original list : " + str(test_list))
# Minimum length a string must have to be kept
thres = 4
# Iterate over a copy so that removing items does not disturb the loop
for i in test_list.copy():
    if len(i) < thres:
        test_list.remove(i)
print("The above Threshold size strings are : " + ' '.join(test_list))
Output
The original list : ['gfg', 'is', 'best', 'for', 'geeks']
The above Threshold size strings are : best geeks
Time Complexity: O(n^2) in the worst case, where n is the number of elements in test_list. The loop runs n times, and each remove() call is itself O(n) because it searches the list and shifts the elements that follow the removed one.
Auxiliary Space: O(n), where n is the number of elements in the list, because copy() creates a second list.
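Each call to remove() searches the list from the start, so the loop above does more work than a single pass as the list grows. If the goal is to modify the existing list object in place, slice assignment is one alternative that needs only one pass. This is a sketch of a different technique, not the method shown above; the name alias is illustrative and is there only to show that the same list object is updated.

```python
test_list = ['gfg', 'is', 'best', 'for', 'geeks']
alias = test_list  # second reference to the same list object
thres = 4

# Slice assignment replaces the contents of the existing list object
test_list[:] = [ele for ele in test_list if len(ele) >= thres]

print(' '.join(test_list))  # best geeks
print(alias is test_list)   # True
```

Any other variable that refers to the same list, such as alias here, sees the filtered contents as well.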
Method #4: Using numpy.array() and numpy.char.str_len()
Note: Install numpy module using command “pip install numpy”
Python3
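import numpy as np
# Initialize the list
test_list = ['gfg', 'is', 'best', 'for', 'geeks']
print("The original list : " + str(test_list))
# Minimum length a string must have to be kept
thres = 4
# Convert to a NumPy array and compute the length of every string
test_array = np.array(test_list)
lengths = np.char.str_len(test_array)
# Boolean indexing keeps the strings whose length is at least thres
res = test_array[lengths >= thres]
print("The above Threshold size strings are : " + str(res))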
Output
The original list : ['gfg', 'is', 'best', 'for', 'geeks']
The above Threshold size strings are : ['best' 'geeks']
Time Complexity: O(n)
Auxiliary Space: O(n)
In this method, we first convert the input list to a NumPy array using numpy.array(). Then we use numpy.char.str_len() to get the length of each string in the array. Finally, we use boolean array indexing to keep the strings whose length is greater than or equal to the threshold. This method relies on NumPy’s vectorized functions and may be more efficient for large inputs, particularly when the data is already stored in a NumPy array.
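The result of the indexing step is a NumPy array, not a Python list, and printing it shows the elements without commas. The sketch below exposes the intermediate boolean mask and converts the result back to a regular list with tolist(); the variable name mask is illustrative.

```python
import numpy as np

test_list = ['gfg', 'is', 'best', 'for', 'geeks']
thres = 4

test_array = np.array(test_list)

# Boolean array with one entry per string: True where the length is at least thres
mask = np.char.str_len(test_array) >= thres

# Select the matching strings and convert back to a list of Python str
res = test_array[mask].tolist()

print(mask.tolist())  # [False, False, True, False, True]
print(res)            # ['best', 'geeks']
```

Converting back with tolist() is useful when the rest of the program expects an ordinary Python list.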
Method #5: Using a for loop and a conditional statement
Steps:
- Initialize a list test_list with some strings.
- Print the original list using print() function.
- Initialize a variable thres with some threshold value.
- Initialize an empty list res.
- Use a for loop to iterate over each element in test_list.
- Use a conditional statement to check if the length of the current element is greater than or equal to the threshold value.
- If the condition is true, append the current element to the res list.
- After the loop, print the filtered list using print() function.
Python3
# Initialize the list
test_list = ['gfg', 'is', 'best', 'for', 'geeks']
print("The original list : " + str(test_list))
# Minimum length a string must have to be kept
thres = 4
# Collect the strings whose length is at least thres
res = []
for ele in test_list:
    if len(ele) >= thres:
        res.append(ele)
print("The above Threshold size strings are : " + str(res))
Output
The original list : ['gfg', 'is', 'best', 'for', 'geeks']
The above Threshold size strings are : ['best', 'geeks']
Time Complexity: O(n), where n is the length of the list.
Auxiliary Space: O(k), where k is the number of elements that satisfy the condition.
Last Updated: 05 May, 2023