Python – Split String on all punctuations

Given a string, split it at every punctuation mark, keeping each punctuation mark as a separate element of the result.

Input : test_str = 'geeksforgeeks! is-best' 
Output : ['geeksforgeeks', '!', 'is', '-', 'best'] 
Explanation : Splits on '!' and '-'. 

Input : test_str = 'geek-sfo, rgeeks! is-best' 
Output : ['geek', '-', 'sfo', ',', 'rgeeks', '!', 'is', '-', 'best'] 
Explanation : Splits on '!', ',' and '-'.

Method #1: Using regex + findall()

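This approach builds a regular expression that matches either a run of word characters (\w+) or a run of characters that are neither whitespace nor word characters ([^\s\w]+); findall() then returns every match in order. Because \w also matches the underscore, '_' is treated as part of a word rather than as punctuation.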

Python3




# Python3 code to demonstrate working of
# Split String on all punctuations
# using regex + findall()
import re
 
# initializing string
test_str = 'geeksforgeeks ! is-best, for @geeks'
 
# printing original String
print("The original string is : " + str(test_str))
 
# using findall() to get all regex matches.
res = re.findall( r'\w+|[^\s\w]+', test_str)
 
# printing result
print("The converted string : " + str(res))


Output

The original string is : geeksforgeeks ! is-best, for @geeks
The converted string : ['geeksforgeeks', '!', 'is', '-', 'best', ',', 'for', '@', 'geeks']

Time Complexity: O(n)

Auxiliary Space: O(n)
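
To make the pattern's behavior more concrete, here is a minimal sketch that wraps the same findall() call in a helper and runs it on the two inputs from the problem statement. The function name split_on_punctuation and the third test string are illustrative assumptions, not part of the original example.

```python
import re

def split_on_punctuation(text):
    # Hypothetical helper name, used only for illustration.
    # \w+      : a run of word characters (letters, digits, underscore)
    # [^\s\w]+ : a run of characters that are neither whitespace nor word characters
    return re.findall(r'\w+|[^\s\w]+', text)

print(split_on_punctuation('geeksforgeeks! is-best'))
# ['geeksforgeeks', '!', 'is', '-', 'best']

print(split_on_punctuation('geek-sfo, rgeeks! is-best'))
# ['geek', '-', 'sfo', ',', 'rgeeks', '!', 'is', '-', 'best']

# Consecutive punctuation marks are returned together as one element
print(split_on_punctuation('wait... what?!'))
# ['wait', '...', 'what', '?!']
```

Note that adjacent punctuation marks come back as a single element; if each mark should be separate, the second alternative could be written as [^\s\w] without the +.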

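Method #2: Using replace() and split() methods

Each punctuation mark is wrapped in '*' delimiters with replace(), and the string is then split on '*'. Unlike the regex method, the resulting pieces keep their surrounding whitespace, and the approach assumes the input itself contains no '*'.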

Python3




# Python3 code to demonstrate working of
# Split String on all punctuations
# using replace() + split()
import string
 
# initializing string
test_str = 'geeksforgeeks ! is-best, for @geeks'
 
# printing original String
print("The original string is : " + str(test_str))
 
# wrap every punctuation mark in '*' delimiters;
# iterating over set(test_str) visits each distinct character once,
# so a mark that occurs several times is not wrapped repeatedly
# (assumes the input itself contains no '*')
marked = test_str
for ch in set(test_str):
    if ch in string.punctuation:
        marked = marked.replace(ch, "*" + ch + "*")
 
# split on the delimiter; whitespace around words is kept
res = marked.split("*")
 
# printing result
print("The converted string : " + str(res))


Output

The original string is : geeksforgeeks ! is-best, for @geeks
The converted string : ['geeksforgeeks ', '!', ' is', '-', 'best', ',', ' for ', '@', 'geeks']
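
The output above keeps the whitespace around each word. As a hedged alternative, the sketch below swaps replace() for re.split() with a capturing group built from string.punctuation, then trims each piece. This needs no '*' delimiter, so input containing '*' is handled too. The helper name split_keep_punctuation is an assumption made for illustration.

```python
import re
import string

def split_keep_punctuation(text):
    # Hypothetical helper name, used only for illustration.
    # Character class containing every ASCII punctuation mark, escaped for regex use
    pattern = '([' + re.escape(string.punctuation) + '])'
    # The capturing group makes re.split() keep each matched mark in the result
    pieces = re.split(pattern, text)
    # Trim whitespace from each piece and drop pieces that end up empty
    return [p.strip() for p in pieces if p.strip()]

print(split_keep_punctuation('geeksforgeeks ! is-best, for @geeks'))
# ['geeksforgeeks', '!', 'is', '-', 'best', ',', 'for', '@', 'geeks']
```

Unlike the findall() pattern, this version returns each punctuation mark separately and does not split on whitespace alone, so 'hello world!' yields ['hello world', '!'].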

Method #3: Using re.sub() method with lambda function

Steps:

  1. Import the re module, initialize the input string test_str, and print it.
  2. Use re.sub() with a lambda to surround each run of non-word characters (matched by \W+, which includes whitespace as well as punctuation) with spaces.
  3. Call split() on the modified string, which splits on whitespace and discards it, and assign the resulting list to res.
  4. Print the resulting list.

Python3




# Python3 code to demonstrate working of
# Split String on all punctuations
# Using re.sub() method with lambda function:
import re
 
# initializing string
test_str = 'geeksforgeeks ! is-best, for @geeks'
 
# printing original String
print("The original string is : " + str(test_str))
 
res = re.sub(r'(\W+)', lambda x: ' '+x.group(0)+' ', test_str).split()
 
# printing result
print("The converted string : " + str(res))


Output

The original string is : geeksforgeeks ! is-best, for @geeks
The converted string : ['geeksforgeeks', '!', 'is', '-', 'best', ',', 'for', '@', 'geeks']

Time complexity: O(n), where n is the length of test_str, since re.sub() with this pattern and split() each make a single pass over the string.

Space complexity: O(n), for the modified string and the resulting list.
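
As a side note, the lambda in this method only re-inserts the matched text between two spaces, so the same substitution can be sketched with a backreference in the replacement string instead. This is a variation shown for comparison, not the method described above.

```python
import re

test_str = 'geeksforgeeks ! is-best, for @geeks'

# r' \1 ' re-inserts the text captured by (\W+) with a space on each side,
# which is what the lambda version does explicitly
res = re.sub(r'(\W+)', r' \1 ', test_str).split()

print(res)
# ['geeksforgeeks', '!', 'is', '-', 'best', ',', 'for', '@', 'geeks']
```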



Last Updated : 07 Apr, 2023