Python | Extract Strings with only Alphabets
When working with Python lists of strings, we sometimes need to keep only the strings made up entirely of alphabetic characters and discard those that contain digits or other characters. This comes up in day-to-day programming and in web development. Let's discuss several ways to perform this task.
Method #1: Using isalpha() + list comprehension
This approach combines isalpha() with a list comprehension: isalpha() tests whether a string consists only of alphabetic characters, and the list comprehension collects the strings that pass the test.
Python3
test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']
print("The original list is : " + str(test_list))
res = [sub for sub in test_list if sub.isalpha()]
print("Strings after filtering : " + str(res))
Output
The original list is : ['gfg', 'is23', 'best', 'for2', 'geeks']
Strings after filtering : ['gfg', 'best', 'geeks']
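Time Complexity: O(n), where n is the number of strings in the list.
Auxiliary Space: O(k), where k is the number of strings that pass the filter.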
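Before relying on isalpha(), it can help to see how it treats a few edge cases. The sketch below uses a made-up list named samples (not part of the original example) and assumes standard Python 3 string behavior: the empty string fails the check, whitespace and underscores fail it, and non-ASCII letters such as 'é' pass.

```python
# Hypothetical sample values chosen to show edge cases of str.isalpha()
samples = ['gfg', '', 'is23', 'two words', 'café', 'snake_case']

# Keep only non-empty strings in which every character is a letter
res = [s for s in samples if s.isalpha()]

print("Strings after filtering : " + str(res))
```

If only the ASCII letters a-z and A-Z should count, isalpha() alone is too permissive and a stricter check is needed.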
Method #2: Using filter() + lambda
This approach combines filter() with a lambda: the lambda applies isalpha() to each string, and filter() keeps only the strings for which it returns True.
Python3
test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']
print("The original list is : " + str(test_list))
res = list(filter(lambda sub: sub.isalpha(), test_list))
print("Strings after filtering : " + str(res))
Output
The original list is : ['gfg', 'is23', 'best', 'for2', 'geeks']
Strings after filtering : ['gfg', 'best', 'geeks']
Time Complexity: O(n)
Auxiliary Space: O(n)
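As a possible variation, the lambda can be dropped by passing str.isalpha to filter() directly. This sketch assumes every element of the list is a str; any other type would raise a TypeError.

```python
test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']

# str.isalpha is called on each item, so no lambda wrapper is needed
res = list(filter(str.isalpha, test_list))

print("Strings after filtering : " + str(res))
```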
Method #3: Without using any builtin methods
Python3
def fun(s):
    # Count the characters of s that are ASCII letters
    c = 0
    up = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    lo = "abcdefghijklmnopqrstuvwxyz"
    for i in s:
        if i in up or i in lo:
            c += 1
    # The string qualifies only if every character was a letter
    if c == len(s):
        return True
    return False

test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']
print("The original list is : " + str(test_list))
res = []
for i in test_list:
    if fun(i):
        res.append(i)
print("Strings after filtering : " + str(res))
Output
The original list is : ['gfg', 'is23', 'best', 'for2', 'geeks']
Strings after filtering : ['gfg', 'best', 'geeks']
Time Complexity: O(n) where n is the total number of values in the list “test_list”.
Auxiliary Space: O(n) where n is the total number of values in the list “test_list”.
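The same character-by-character check can be sketched more compactly with all() and string.ascii_letters. The helper name only_ascii_letters is hypothetical, and unlike fun() above it rejects the empty string, which matches the behavior of isalpha().

```python
import string

# Set of a-z and A-Z for fast membership tests
LETTERS = set(string.ascii_letters)

def only_ascii_letters(s):
    # Hypothetical helper: True when s is non-empty and every
    # character is an ASCII letter
    return len(s) > 0 and all(ch in LETTERS for ch in s)

test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']
res = [s for s in test_list if only_ascii_letters(s)]

print("Strings after filtering : " + str(res))
```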
Method #4: Using enumerate()
Python3
lst = ['gfg', 'is23', 'best', 'for2', 'geeks']
res = [i for a, i in enumerate(lst) if i.isalpha()]
print(str(res))
Output
['gfg', 'best', 'geeks']
Time Complexity: O(n), where n is the length of the input list, because enumerate() visits each element exactly once. The index it yields (a) is not used in the filter.
Auxiliary Space: O(n) in the worst case, because the result list res can grow as large as the input list.
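enumerate() becomes useful when the positions of the matching strings matter. The following sketch is an illustrative extension, not part of the original method: it keeps each alphabetic string together with its index in the input list.

```python
lst = ['gfg', 'is23', 'best', 'for2', 'geeks']

# Pair each alphabetic string with its position in the original list
res = [(idx, item) for idx, item in enumerate(lst) if item.isalpha()]

print(res)
```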
Method #5: Using operator.countOf()
Python3
import operator as op

def fun(s):
    # Count the characters of s that are ASCII letters
    c = 0
    up = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    lo = "abcdefghijklmnopqrstuvwxyz"
    for i in s:
        # countOf() > 0 means the character occurs in the alphabet string
        if op.countOf(up, i) > 0 or op.countOf(lo, i) > 0:
            c += 1
    if c == len(s):
        return True
    return False

test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']
print("The original list is : " + str(test_list))
res = []
for i in test_list:
    if fun(i):
        res.append(i)
print("Strings after filtering : " + str(res))
Output
The original list is : ['gfg', 'is23', 'best', 'for2', 'geeks']
Strings after filtering : ['gfg', 'best', 'geeks']
Time Complexity: O(n)
Auxiliary Space: O(1) for the check itself; the result list res takes up to O(n).
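For readers unfamiliar with operator.countOf(a, b): it returns how many times b occurs in a, so a result greater than 0 works as a membership test. A small sketch with made-up inputs:

```python
import operator as op

lo = "abcdefghijklmnopqrstuvwxyz"

# countOf(a, b) counts the occurrences of b in a
in_alphabet = op.countOf(lo, 'g')      # 'g' is in the lowercase alphabet
digit_count = op.countOf(lo, '2')      # digits are not in the alphabet
repeated = op.countOf("geeks", 'e')    # counts every occurrence of 'e'

print(in_alphabet, digit_count, repeated)
```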
Method #6: Using itertools.filterfalse() method
Python3
import itertools
test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']
print("The original list is : " + str(test_list))
res = list(itertools.filterfalse(lambda sub: not sub.isalpha(), test_list))
print("Strings after filtering : " + str(res))
Output
The original list is : ['gfg', 'is23', 'best', 'for2', 'geeks']
Strings after filtering : ['gfg', 'best', 'geeks']
Time Complexity: O(n)
Auxiliary Space: O(n)
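filterfalse() keeps the items for which the predicate is false, which is why the lambda above negates isalpha(). A sketch of the more direct use, assuming we also want the rejected strings (the names kept and rejected are illustrative):

```python
import itertools

test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']

# filter() keeps strings where str.isalpha is true
kept = list(filter(str.isalpha, test_list))

# filterfalse() keeps strings where str.isalpha is false
rejected = list(itertools.filterfalse(str.isalpha, test_list))

print("Alphabetic strings : " + str(kept))
print("Rejected strings : " + str(rejected))
```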
Method #7: Using re
Python3
import re

def is_alpha(word):
    return bool(re.match('^[a-zA-Z]+$', word))

test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']
print("The original list is : " + str(test_list))
res = [word for word in test_list if is_alpha(word)]
print("Strings after filtering : " + str(res))
Output
The original list is : ['gfg', 'is23', 'best', 'for2', 'geeks']
Strings after filtering : ['gfg', 'best', 'geeks']
Time Complexity: O(n)
Auxiliary Space: O(1) for the check itself; the result list res takes up to O(n).
In this method, we use re.match() method from the re library to check if the word consists of only alphabet characters or not. The regular expression pattern ‘^[a-zA-Z]+$’ matches the start (^) and end ($) of the word, and between them, there must be one or more alphabet characters (a-z or A-Z) only.
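Two details of the regex approach are worth checking in practice. The character class [a-zA-Z] covers ASCII letters only, whereas isalpha() also accepts letters such as 'é', and $ also matches just before a trailing newline. The sketch below is one possible stricter variant using a precompiled pattern with fullmatch(); the names ASCII_ALPHA and is_ascii_alpha and the sample words are assumptions made for illustration.

```python
import re

# Precompiled ASCII-only pattern; fullmatch() requires the whole string to match
ASCII_ALPHA = re.compile(r'[a-zA-Z]+')

def is_ascii_alpha(word):
    # Hypothetical helper: True only for non-empty strings of ASCII letters
    return ASCII_ALPHA.fullmatch(word) is not None

samples = ['geeks', 'café', 'best\n']
for word in samples:
    # Columns: the word, the fullmatch() check, the re.match() check, isalpha()
    print(repr(word), is_ascii_alpha(word),
          bool(re.match('^[a-zA-Z]+$', word)), word.isalpha())
```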
Method #8: Using reduce()
Algorithm
- Import the functools module.
- Initialize a list test_list with some strings containing alphanumeric characters.
- Print the original list.
- Use reduce() to iterate over each string in test_list, and build a new list res that only contains the strings that are made up entirely of alphabetical characters.
- In the reduce() function, use a lambda function that takes two arguments: an accumulator list acc, and the current string sub.
- If sub contains only alphabetical characters, append it to acc. Otherwise, return acc unchanged.
- The initial value of the accumulator list acc is an empty list [].
- Store the result of reduce() in res.
- Print the filtered list res.
- Note that this algorithm assumes that you want to keep only the strings that are made up entirely of alphabetical characters. If you want to keep strings that have at least one alphabetical character, you can modify the lambda function to check whether any character in the string is alphabetical.
Python3
import functools
test_list = ['gfg', 'is23', 'best', 'for2', 'geeks']
print("The original list is : " + str(test_list))
res = functools.reduce(
    lambda acc, sub: acc + [sub] if sub.isalpha() else acc, test_list, [])
print("Strings after filtering : " + str(res))
Output
The original list is : ['gfg', 'is23', 'best', 'for2', 'geeks']
Strings after filtering : ['gfg', 'best', 'geeks']
Time Complexity: O(n * m)
The reduce() function iterates over each string in test_list.
For each string, the lambda function checks whether the string is alphabetical, which takes O(m) time, where m is the length of the string.
Therefore, the overall time complexity of the checks is O(n * m), where n is the number of strings in test_list and m is the maximum length of a string in test_list.
Note that acc + [sub] builds a new list each time a string is kept, so for long lists the copying can add up to O(n^2) element copies in the worst case.
Auxiliary Space: O(n)
The space complexity of the code is O(n), where n is the number of strings in test_list, because the filtered strings are stored in a new list res.
The space used by the lambda function and reduce() itself is negligible compared to the size of test_list.
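The variation mentioned in the algorithm notes, keeping strings that contain at least one alphabetical character, might look like the sketch below. The input list is made up for illustration so that the difference from the all-alphabetic filter is visible.

```python
import functools

# Hypothetical input: includes strings with no letters at all
test_list = ['gfg', 'is23', '2023', 'for2', '!!']

# Keep a string if any single character in it is alphabetic
res = functools.reduce(
    lambda acc, sub: acc + [sub] if any(ch.isalpha() for ch in sub) else acc,
    test_list,
    [],
)

print("Strings with at least one letter : " + str(res))
```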
Last Updated :
21 Mar, 2023