Chunking all proper nouns (tagged with NNP) is a very simple way to perform named entity extraction. A simple grammar that combines all proper nouns into a NAME chunk can be created using the RegexpParser class.
Then, we can test this on the first tagged sentence of treebank_chunk to compare the results with the previous recipe:
Code #1 : Testing it on the first tagged sentence of treebank_chunk
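A minimal sketch of such a chunker is given here. Two things in it are assumptions rather than part of the recipe: the tagged sentence is typed in by hand, as it appears at the start of treebank_chunk, so the sketch runs without downloading the corpus, and `sub_leaves` is a small helper defined inline to collect the leaves of every NAME subtree.

```python
from nltk.chunk.regexp import RegexpParser

# First tagged sentence of treebank_chunk, typed in by hand (an assumption of
# this sketch) so that it runs without downloading the corpus.
tagged_sent = [
    ('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ('61', 'CD'),
    ('years', 'NNS'), ('old', 'JJ'), (',', ','), ('will', 'MD'),
    ('join', 'VB'), ('the', 'DT'), ('board', 'NN'), ('as', 'IN'),
    ('a', 'DT'), ('nonexecutive', 'JJ'), ('director', 'NN'),
    ('Nov.', 'NNP'), ('29', 'CD'), ('.', '.'),
]


def sub_leaves(tree, label):
    """Return the leaves of every subtree that carries the given label."""
    return [t.leaves() for t in tree.subtrees(lambda s: s.label() == label)]


# One or more consecutive NNP tags form a single NAME chunk.
chunker = RegexpParser(r'''
NAME:
    {<NNP>+}
''')

named_entities = sub_leaves(chunker.parse(tagged_sent), 'NAME')
print("Named Entities :", named_entities)
```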
Named Entities : [[('Pierre', 'NNP'), ('Vinken', 'NNP')], [('Nov.', 'NNP')]]
Note : The chunker returns every proper noun in the sentence (‘Pierre’, ‘Vinken’ and ‘Nov.’), even though ‘Nov.’ is a month abbreviation rather than a name.
The NAME chunker is a simple use of the RegexpParser class: every sequence of NNP-tagged words is combined into a NAME chunk.
The PersonChunker class can be used if one only wants to chunk the names of people.
Code #2 : PersonChunker class
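One way such a class could look is sketched here, built on NLTK's `ChunkParserI` interface and `conlltags2tree()`. The optional `names_set` constructor argument is a hypothetical addition that lets the class be tried without the names corpus; when it is omitted, the set is built from `names.words()`, which assumes the corpus has already been downloaded.

```python
from nltk.chunk import ChunkParserI
from nltk.chunk.util import conlltags2tree
from nltk.corpus import names


class PersonChunker(ChunkParserI):
    def __init__(self, names_set=None):
        # names_set is a hypothetical override added for this sketch. By
        # default the set is built from the NLTK names corpus, which has to
        # be downloaded first, e.g. with nltk.download('names').
        if names_set is None:
            names_set = names.words()
        self.names_set = set(names_set)

    def parse(self, tagged_sent):
        iobs = []
        in_person = False  # True while the previous word was a known name

        for word, tag in tagged_sent:
            if word in self.names_set and in_person:
                # Continues a run of names.
                iobs.append((word, tag, 'I-PERSON'))
            elif word in self.names_set:
                # Starts a new run of names.
                iobs.append((word, tag, 'B-PERSON'))
                in_person = True
            else:
                # Not a known name: outside any chunk.
                iobs.append((word, tag, 'O'))
                in_person = False

        # Convert the (word, tag, IOB) triples into a Tree.
        return conlltags2tree(iobs)
```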
The PersonChunker class iterates over the tagged sentence and checks whether each word is in its names_set (constructed from the names corpus). A word that is in the names_set is assigned either the B-PERSON or the I-PERSON IOB tag, depending on whether the previous word was also in the names_set. A word that is not in the names_set is assigned the O IOB tag. Once the whole sentence has been processed, the list of IOB tags is converted to a Tree using conlltags2tree().
Code #3 : Using PersonChunker class on the same tagged sentence
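The sketch below is self-contained and walks through the same steps on that sentence. It rests on several assumptions: the tagging logic is inlined as a hypothetical helper, `person_iob`, so the block runs on its own; a tiny hand-picked set stands in for the names corpus; and the sentence is typed in by hand rather than loaded from treebank_chunk.

```python
from nltk.chunk.util import conlltags2tree

# Small stand-in for set(names.words()); an assumption so that the sketch
# runs without downloading the names corpus.
names_set = {'Pierre', 'Mary', 'John'}

# First tagged sentence of treebank_chunk, typed in by hand.
tagged_sent = [
    ('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ('61', 'CD'),
    ('years', 'NNS'), ('old', 'JJ'), (',', ','), ('will', 'MD'),
    ('join', 'VB'), ('the', 'DT'), ('board', 'NN'), ('as', 'IN'),
    ('a', 'DT'), ('nonexecutive', 'JJ'), ('director', 'NN'),
    ('Nov.', 'NNP'), ('29', 'CD'), ('.', '.'),
]


def person_iob(tagged_sent, names_set):
    """Hypothetical helper: tag known names B-PERSON/I-PERSON, the rest O."""
    iobs, in_person = [], False
    for word, tag in tagged_sent:
        if word in names_set:
            iobs.append((word, tag, 'I-PERSON' if in_person else 'B-PERSON'))
            in_person = True
        else:
            iobs.append((word, tag, 'O'))
            in_person = False
    return iobs


iobs = person_iob(tagged_sent, names_set)
tree = conlltags2tree(iobs)

# Collect the leaves of every PERSON subtree.
person_names = [t.leaves() for t in tree.subtrees(lambda s: s.label() == 'PERSON')]
print("Person name :", person_names)
```

Only ‘Pierre’ is in the set, so ‘Vinken’ and ‘Nov.’ both receive the O tag and are left out of the result.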
Person name : [[('Pierre', 'NNP')]]