A named entity chunker can be trained using the ieer corpus, whose name stands for Information Extraction: Entity Recognition. The ieer corpus has chunk trees but no part-of-speech tags for the words, so preparing it for training takes some extra work.
Named entity chunk trees can be created from the ieer corpus using the ieertree2conlltags() and ieer_chunked_sents() functions. The resulting trees can be used to train the ClassifierChunker class introduced under Classification-based chunking.
Code #1 : ieertree2conlltags()
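The listing for this step is missing here, so the following is only a sketch of what such a converter might look like, not the original code. It assumes the tree exposes `pos()` and `label()` as `nltk.Tree` does, that words sitting directly under the root are outside any entity, and that the tagger defaults to `nltk.tag.pos_tag()`. The helper name `entities_to_iob` and the `tag=None` default are illustrative choices.

```python
def entities_to_iob(ents, root_label):
    """Turn each word's parent label into an IOB chunk tag.

    Hypothetical helper: `ents` is the sequence of parent labels that
    tree.pos() reports, one per word.
    """
    iobs = []
    prev = None
    for ent in ents:
        if ent == root_label:
            # The word hangs directly off the root: outside any entity.
            iobs.append('O')
            prev = None
        elif ent == prev:
            # Same label as the previous word: continue that chunk.
            iobs.append('I-' + ent)
        else:
            # A different label starts a new chunk.
            iobs.append('B-' + ent)
            prev = ent
    return iobs


def ieertree2conlltags(tree, tag=None):
    """Return (word, pos, iob) triples for an ieer document tree."""
    if tag is None:
        # Imported lazily; pos_tag needs NLTK's tagger model downloaded.
        from nltk.tag import pos_tag
        tag = pos_tag
    pairs = tree.pos()  # [(word, parent_label), ...]
    if not pairs:
        return []
    words, ents = zip(*pairs)
    iobs = entities_to_iob(ents, tree.label())
    tagged = tag(list(words))  # guessed part-of-speech tags
    return [(word, pos, iob) for (word, pos), iob in zip(tagged, iobs)]
```

One limitation of this sketch: two adjacent entities with the same label are merged into a single chunk, because only the label of the previous word is compared.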
Code #2 : ieer_chunked_sents()
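The listing for this step is also missing, so what follows is a sketch under stated assumptions rather than the original code. It assumes each document returned by `ieer.parsed_docs()` carries its body as a tree in `doc.text`, and that `nltk.chunk.util.conlltags2tree()` turns (word, pos, iob) triples back into a chunk tree. To keep the block self-contained, the converter named under Code #1 is passed in as the `to_conlltags` argument; that parameter and the helper `chunk_trees_from_docs` are illustrative and not part of NLTK.

```python
def chunk_trees_from_docs(docs, to_conlltags, to_tree):
    """Yield one chunk tree per document (hypothetical helper).

    docs         : iterable of objects with a `.text` tree attribute
    to_conlltags : callable mapping a tree to (word, pos, iob) triples
    to_tree      : callable mapping those triples to a chunk tree
    """
    for doc in docs:
        # Each document body is converted as a whole, so every
        # document ends up as a single tree.
        triples = list(to_conlltags(doc.text))
        yield to_tree(triples)


def ieer_chunked_sents(to_conlltags):
    """Generate chunk trees for every document in the ieer corpus."""
    # Imported lazily; the corpus data needs nltk.download('ieer').
    from nltk.chunk.util import conlltags2tree
    from nltk.corpus import ieer
    return chunk_trees_from_docs(ieer.parsed_docs(), to_conlltags,
                                 conlltags2tree)
```

With the corpus installed, `list(ieer_chunked_sents(ieertree2conlltags))` would collect the trees into a list that can be sliced for training and testing.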
Of the 94 chunk trees (one per document), 80 are used for training and the remaining 14 for testing.
Code #3 : How the classifier works on the first sentence of the treebank_chunk corpus.
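The listing for this step is missing as well. As a rough sketch, the split, training and scoring could be wrapped as below. `ClassifierChunker` is not defined in this article, so the chunker class is passed in as `chunker_cls`; the function name `train_and_evaluate` is hypothetical, and taking the first 80 trees for training is an assumption, since the text only gives the counts. The sketch assumes the chunker is built from a list of training trees and offers `evaluate()`, as NLTK chunk parsers do.

```python
def train_and_evaluate(chunk_trees, chunker_cls, n_train=80):
    """Train a chunker on the first n_train trees, score it on the rest.

    Hypothetical helper: chunker_cls is any class whose constructor
    takes a list of training trees and whose instances provide
    evaluate(test_trees).
    """
    chunk_trees = list(chunk_trees)
    train = chunk_trees[:n_train]   # 80 of the 94 trees by default
    test = chunk_trees[n_train:]    # the remaining trees
    chunker = chunker_cls(train)
    score = chunker.evaluate(test)
    return chunker, score


# Usage sketch (needs the ieer and treebank corpora, plus a
# ClassifierChunker class defined elsewhere):
#
#   from nltk.corpus import treebank_chunk
#   chunker, score = train_and_evaluate(ieer_chunks, ClassifierChunker)
#   print("Length of ieer_chunks :", len(ieer_chunks))
#   print("parsing :", chunker.parse(treebank_chunk.tagged_sents()[0]))
#   print("Accuracy :", score.accuracy())
#   print("Precision :", score.precision())
#   print("Recall :", score.recall())
```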
Output :
Length of ieer_chunks : 94
parsing :
Tree('S', [Tree('LOCATION', [('Pierre', 'NNP'), ('Vinken', 'NNP')]), (', ', ', '), Tree('DURATION', [('61', 'CD'), ('years', 'NNS')]), Tree('MEASURE', [('old', 'JJ')]), (', ', ', '), ('will', 'MD'), ('join', 'VB'), ('the', 'DT'), ('board', 'NN'), ('as', 'IN'), ('a', 'DT'), ('nonexecutive', 'JJ'), ('director', 'NN'), Tree('DATE', [('Nov.', 'NNP'), ('29', 'CD')]), ('.', '.')])
Accuracy : 0.8829018388070625
Precision : 0.4088717454194793
Recall : 0.5053635280095352
How it works?
The ieer trees generated by ieer_chunked_sents() are not entirely accurate. There are no explicit sentence breaks, so each document is a single tree. The words are also not explicitly tagged, so their part-of-speech tags are guesses produced by nltk.tag.pos_tag().