Python NLTK | nltk.tokenize.SExprTokenizer()

Last Updated : 12 Jun, 2019

The nltk.tokenize.SExprTokenizer class splits a string into tokens by matching parentheses. Each balanced parenthesized expression, including anything nested inside it, becomes a single token, and text outside the parentheses is split on whitespace.

Syntax : tokenize.SExprTokenizer().tokenize(text)
Return : A list of tokens, one for each top-level parenthesized expression or whitespace-separated word in the string.

Example #1 :
In this example, the tokenize() method of SExprTokenizer splits a string into tokens by matching brackets. The nested expression stays inside its enclosing token, and the text between the two bracketed groups becomes a token of its own.

# Import the SExprTokenizer class from nltk
from nltk.tokenize import SExprTokenizer

# Create an instance of the SExprTokenizer class
tk = SExprTokenizer()

# Create a string input
gfg = "( a * ( b + c ))ab( a-c )"

# Use the tokenize method
geek = tk.tokenize(gfg)

print(geek)

Output :

['( a * ( b + c ))', 'ab', '( a-c )']
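
The grouping seen in Example #1 can be imitated with a simple depth counter. The sketch below is not NLTK's source code; the function name split_sexprs and its parameters are hypothetical, and it only covers the strict case where every bracket must be matched. It is meant to show why nested brackets stay inside one token.

```python
def split_sexprs(text, open_paren="(", close_paren=")"):
    """Split text into top-level bracketed groups and bare words (simplified sketch)."""
    tokens = []
    depth = 0   # how many brackets are currently open
    start = 0   # index where the not-yet-emitted text begins
    for i, ch in enumerate(text):
        if ch == open_paren:
            if depth == 0:
                # Text before a top-level bracket is split on whitespace.
                tokens.extend(text[start:i].split())
                start = i
            depth += 1
        elif ch == close_paren:
            if depth == 0:
                raise ValueError(f"Unmatched close bracket at index {i}")
            depth -= 1
            if depth == 0:
                # A top-level group just closed: emit it unchanged.
                tokens.append(text[start:i + 1])
                start = i + 1
    if depth > 0:
        raise ValueError(f"Unmatched open bracket at index {start}")
    # Whatever follows the last group is split on whitespace as well.
    tokens.extend(text[start:].split())
    return tokens


print(split_sexprs("( a * ( b + c ))ab( a-c )"))
# ['( a * ( b + c ))', 'ab', '( a-c )']
```

Because a token is emitted only when the depth returns to zero, the inner "( b + c )" never becomes a separate token.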

Example #2 :

# Import the SExprTokenizer class from nltk
from nltk.tokenize import SExprTokenizer

# Create an instance of the SExprTokenizer class
tk = SExprTokenizer()

# Create a string input
gfg = "(a b) c d (e f)"

# Use the tokenize method
geek = tk.tokenize(gfg)

print(geek)

Output :

['(a b)', 'c', 'd', '(e f)']
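
Both examples above rely on the default settings. As a further sketch, assuming an installed NLTK whose SExprTokenizer constructor accepts the parens and strict arguments described in the NLTK documentation, the bracket characters can be changed and unbalanced input can be tolerated. The variable names here are illustrative only.

```python
from nltk.tokenize import SExprTokenizer

# Use curly braces as the bracket pair instead of parentheses.
curly = SExprTokenizer(parens="{}")
curly_tokens = curly.tokenize("{a b {c d}} e f {g}")
print(curly_tokens)
# ['{a b {c d}}', 'e', 'f', '{g}']

# With strict=False, unbalanced brackets do not raise an error.
lenient = SExprTokenizer(strict=False)
lenient_tokens = lenient.tokenize("c) d) e (f (g")
print(lenient_tokens)
# ['c', ')', 'd', ')', 'e', '(f (g']

# The default (strict=True) raises ValueError on unbalanced input.
try:
    SExprTokenizer().tokenize("(a b")
    strict_failed = False
except ValueError as err:
    strict_failed = True
    print("ValueError:", err)
```

In non-strict mode an unmatched closing bracket becomes its own token, and a trailing group that is never closed is returned as one final token.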
