Skip to content
Related Articles

Related Articles

Improve Article
Save Article
Like Article

Python | Perform Sentence Segmentation Using Spacy

  • Last Updated : 05 Sep, 2020

The process of deciding from where the sentences actually start or end in NLP or we can simply say that here we are dividing a paragraph based on sentences. This process is known as Sentence Segmentation. In Python, we implement this part of NLP using the spacy library.

Spacy is used for Natural Language Processing in Python.

 Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.  

To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course. And to begin with your Machine Learning Journey, join the Machine Learning - Basic Level Course

To use this library in our python program we first need to install it.



Command to install this library:

pip install spacy
python -m spacy download en_core_web_sm
Here en_core_web_sm  means core English Language available online of small size.

Example:

we have the following paragraph:
"I Love Coding. Geeks for Geeks helped me in this regard very much. I Love Geeks for Geeks."
here there are 3 sentences.
1. I Love Coding.
2. Geeks for Geeks helped me in this regard very much.
3. I Love Geeks for Geeks

In python, .sents is used for sentence segmentation which is present inside spacy. The output is given by .sents is a generator and we need to use the list if we want to print them randomly.

Code:




#import spacy library
import spacy
  
#load core english library
nlp = spacy.load("en_core_web_sm")
  
#take unicode string  
#here u stands for unicode
doc = nlp(u"I Love Coding. Geeks for Geeks helped me in this regard very much. I Love Geeks for Geeks.")
#to print sentences
for sent in doc.sents:
  print(sent)

Output:

Now if we try to use doc.sents randomly then what happens:

Code: To overcome this error we first need to convert this generator into a list using list function.




#converting the generator object result in to list
doc1 = list(doc.sents)
  
#Now we can use it randomly as
doc1[1]

Output:




My Personal Notes arrow_drop_up
Recommended Articles
Page :

Start Your Coding Journey Now!