Python – Split strings ignoring the space formatting characters

Given a string, split it into words, ignoring whitespace formatting characters such as \n, \t, and \r.

Input : test_str = 'geeksforgeeks\n\r\n\t\t\n\t\tbest\r\tfor\f\vgeeks'
Output : ['geeksforgeeks', 'best', 'for', 'geeks']
Explanation : Every run of whitespace characters is treated as a single separator for the split.

Input : test_str = 'geeksforgeeks\n\r\n\t\t\n\t\tbest'
Output : ['geeksforgeeks', 'best']
Explanation : Every run of whitespace characters is treated as a single separator for the split.

Method 1: Using re.split()

Here we build a regex character class containing the whitespace characters and pass it to re.split(), which splits the string on every run of those characters.

Python3

# Python3 code to demonstrate working of 
# Split Strings ignoring Space characters
# Using re.split()

import re
 
# initializing string

test_str = 'geeksforgeeks\n\r\t\t\nis\t\tbest\r\tfor geeks'
 
# printing original string

print("The original string is : " + str(test_str))
 
# space regex with split returns the result

res = re.split(r'[\n\t\f\v\r ]+', test_str)

# printing result 

print("The split string : " + str(res))

Output:

The original string is : geeksforgeeks

        
is        best
    for geeks
The split string : ['geeksforgeeks', 'is', 'best', 'for', 'geeks']

Time Complexity: O(n)

Auxiliary Space: O(n)
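
One caveat with the regex approach: when the string starts or ends with whitespace, re.split() returns empty strings at the ends of the list. The sketch below is only an illustration, not part of the original method. It assumes a hypothetical helper name, split_ws_regex, uses the \s shorthand instead of the explicit character class, and drops the empty pieces.

```python
import re


def split_ws_regex(text):
    # Hypothetical helper, named here only for illustration.
    # For str patterns, \s matches any whitespace character,
    # including \n, \t, \f, \v, \r and the plain space.
    parts = re.split(r'\s+', text)
    # Leading or trailing whitespace yields '' at the ends; drop those.
    return [p for p in parts if p]


padded = '\t geeksforgeeks\n\r is\tbest \n'

print(re.split(r'\s+', padded))  # ['', 'geeksforgeeks', 'is', 'best', '']
print(split_ws_regex(padded))    # ['geeksforgeeks', 'is', 'best']
```

The test string used in Method 1 has no leading or trailing whitespace, which is why its output contains no empty strings.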

Method 2: Using split()

When called with no arguments, the str.split() method splits on runs of whitespace characters and ignores leading and trailing whitespace, so no regex is needed.

Python3

# Python3 code to demonstrate working of 
# Split Strings ignoring Space characters
# Using split()
 
# initializing string

test_str = 'geeksforgeeks\n\r\t\t\nis\t\tbest\r\tfor geeks'
 
# printing original string

print("The original string is : " + str(test_str))

# printing result 

print("The split string : " + str(test_str.split()))

Output:

The original string is : geeksforgeeks

        
is        best
    for geeks
The split string : ['geeksforgeeks', 'is', 'best', 'for', 'geeks']

Time Complexity: O(n)

Auxiliary Space: O(n)
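
The no-argument form matters here. As a small sketch (the sample string is made up for illustration), compare split() with split(' '), which treats only the space character as a separator and counts every single occurrence:

```python
text = ' geeksforgeeks\t is  best\n'

# No argument: any run of whitespace is one separator,
# and leading/trailing whitespace is ignored.
default_split = text.split()

# Explicit ' ': only the space character separates, and each space
# counts on its own, so empty strings and other whitespace survive.
space_split = text.split(' ')

print(default_split)  # ['geeksforgeeks', 'is', 'best']
print(space_split)    # ['', 'geeksforgeeks\t', 'is', '', 'best\n']
```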

Method 3: Using split() with filter()

Use the str.split() method to split the input string into substrings.
Use the filter() function to remove any empty or whitespace-only strings from the resulting list. A no-argument split() never produces such strings, so this step acts only as a safeguard.
Return the filtered list of substrings.

Python3

# Python program for the above approach

# Function to split the string
def split_string(test_str):
    # split on runs of whitespace
    substrings = test_str.split()
    # keep only substrings that are non-empty after stripping
    substrings = list(filter(lambda s: s.strip(), substrings))
    return substrings

# Driver Code
test_str = 'geeksforgeeks\n\r\t\t\nis\t\tbest\r\tfor geeks'

print(split_string(test_str))

Output

['geeksforgeeks', 'is', 'best', 'for', 'geeks']

Time Complexity: O(n), where n is the length of the input string. The split() method takes linear time in the length of the string.

Space Complexity: O(n), where n is the length of the input string. The space used by the resulting list of substrings is proportional to the length of the input string.
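
The filter step does real work when the split uses an explicit separator, because that form can return empty or whitespace-only pieces. The sketch below assumes a hypothetical helper, split_on, and a made-up comma-separated string. It is an illustration of where filtering is needed, not a replacement for the method above.

```python
def split_on(text, sep):
    # Hypothetical helper, named here only for illustration.
    # Splitting on an explicit separator can leave '' or
    # whitespace-only pieces in the list.
    pieces = text.split(sep)
    # Strip each piece, then let filter(None, ...) drop the empty ones.
    return list(filter(None, (p.strip() for p in pieces)))


row = 'geeksforgeeks,\t ,is,,best\n'

print(row.split(','))      # ['geeksforgeeks', '\t ', 'is', '', 'best\n']
print(split_on(row, ','))  # ['geeksforgeeks', 'is', 'best']
```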

Method 4: Using itertools.groupby()

Use the itertools module to group contiguous non-space characters together, then join each group into a separate substring.

Steps :

Import the itertools module to work with iterators and grouping functions.
Use the itertools.groupby() function to group contiguous non-space characters in the input string.
Use a list comprehension to join the characters in each group into separate substrings.
Print the resulting list of substrings.

Python3

import itertools
 
# initializing string

test_str = 'geeksforgeeks\n\r\t\t\nis\t\tbest\r\tfor geeks'
 
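# splitting string using itertools module:
# groupby() starts a new group each time isspace() changes value,
# and only the non-space groups are joined into words
result = [
    ''.join(group)
    for is_space, group in itertools.groupby(test_str, lambda x: x.isspace())
    if not is_space
]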
 
# printing result

print("The split string : " + str(result))

Output

The split string : ['geeksforgeeks', 'is', 'best', 'for', 'geeks']

Time complexity: The itertools.groupby() function has a linear time complexity in the length of the input string, so this approach has a time complexity of O(n), where n is the length of the input string.

Auxiliary space: This approach creates a list to store the resulting substrings, so it has an auxiliary space complexity of O(n), where n is the length of the input string.
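
Because groupby() is lazy, the same idea can be written as a generator that yields one word at a time instead of building the whole list. This is a minimal sketch under that assumption. The name iter_words is hypothetical, and str.isspace is passed directly as the key function in place of the lambda.

```python
from itertools import groupby


def iter_words(text):
    # Hypothetical helper, named here only for illustration.
    # groupby() starts a new group whenever str.isspace() changes
    # value between consecutive characters.
    for is_space, group in groupby(text, key=str.isspace):
        if not is_space:
            # Join the characters of one non-space run into a word.
            yield ''.join(group)


test_str = 'geeksforgeeks\n\r\t\t\nis\t\tbest\r\tfor geeks'

for word in iter_words(test_str):
    print(word)
```

Calling list(iter_words(test_str)) gives the same list as the comprehension above. The generator form can be handy when the words are consumed one by one and the full list is not needed.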
