NLP | Combining NGram Taggers
NgramTagger has 3 subclasses: UnigramTagger, BigramTagger and TrigramTagger.
The BigramTagger subclass uses the previous tag as part of its context.
The TrigramTagger subclass uses the previous two tags as part of its context.
ngram – a subsequence of n items.
Idea of NgramTagger subclasses :
- By looking at the part-of-speech (POS) tags of the previous words, the tag for the current word can be guessed.
- Each tagger maintains a context dictionary (implemented in the ContextTagger parent class).
- This dictionary is used to guess the tag based on the context.
- For the NgramTagger subclasses, the context is the current word together with the tags of some number of previous words.
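The context dictionary idea can be sketched in plain Python. This is an illustration only, not NLTK's actual implementation: the helper name build_context_table and the one-sentence training set are assumptions made up for this example.

```python
from collections import Counter, defaultdict


def build_context_table(tagged_sents, n):
    """Map (previous n-1 tags, current word) to the tag seen most often there."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        tags = [tag for _, tag in sent]
        for i, (word, tag) in enumerate(sent):
            # Context = up to n-1 preceding tags plus the current word.
            context = (tuple(tags[max(0, i - n + 1):i]), word)
            counts[context][tag] += 1
    # Keep only the most frequent tag for each context.
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


# Hand-made training data (hypothetical, for illustration only).
train_sents = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]]

table = build_context_table(train_sents, n=2)  # n=2 mimics a bigram context
for context, tag in table.items():
    print(context, "->", tag)
```

With n=2 each key holds one previous tag; with n=3 it would hold up to two, which is why longer contexts are matched less often on new text.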
Code #1 : Working of Bigram tagger
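A minimal sketch of a bigram tagger, assuming NLTK is installed. The one-sentence training set is made up for illustration, so the result says nothing about accuracy on a real corpus.

```python
from nltk.tag import BigramTagger

# Hand-made training data (an assumption for illustration, not a real corpus).
train_sents = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]]

bitagger = BigramTagger(train_sents)

# Every (previous tag, word) context here was seen during training.
seen = bitagger.tag(["the", "dog", "barks"])
print(seen)

# "bird" was never seen, so it is tagged None; that None then becomes
# the previous tag for "barks", a context the tagger has not seen either.
unseen = bitagger.tag(["the", "bird", "barks"])
print(unseen)
```

A single unknown word leaves the rest of the sentence untagged, which is why a bigram tagger is rarely used without a backoff.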
Code #2 : Working of Trigram tagger
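A minimal sketch of a trigram tagger, again assuming NLTK is installed and using a made-up one-sentence training set.

```python
from nltk.tag import TrigramTagger

# Hand-made training data (an assumption for illustration, not a real corpus).
train_sents = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]]

tritagger = TrigramTagger(train_sents)

# The sentence matches the training data, so every context is known.
seen = tritagger.tag(["the", "dog", "barks"])
print(seen)

# An unknown first word yields None, and the following words then have
# None in their previous-tag context, which was never seen in training.
unseen = tritagger.tag(["a", "dog", "barks"])
print(unseen)
```

Because its context is longer, the trigram tagger is even more sensitive to unseen material than the bigram tagger.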
Code #3 : Collectively using Unigram, Bigram and Trigram tagger.
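A minimal sketch of the combined tagger, assuming NLTK is installed. The backoff_tagger helper is written out here to match the description that follows; the training sentences are made up for illustration.

```python
from nltk.tag import DefaultTagger, UnigramTagger, BigramTagger, TrigramTagger


def backoff_tagger(train_sents, tagger_classes, backoff=None):
    """Train each class in order, chaining the previous tagger as its backoff."""
    for cls in tagger_classes:
        backoff = cls(train_sents, backoff=backoff)
    return backoff


# Hand-made training data (an assumption for illustration, not a real corpus).
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("dog", "NN"), ("sleeps", "VBZ")],
]

# DefaultTagger is the last resort: any word nobody else can tag gets "NN".
backoff = DefaultTagger("NN")
tagger = backoff_tagger(
    train_sents, [UnigramTagger, BigramTagger, TrigramTagger], backoff=backoff
)

# "bird" is unknown, yet the chain still tags the whole sentence.
result = tagger.tag(["the", "bird", "sleeps"])
print(result)
```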
How it works ?
- The backoff_tagger function creates an instance of each tagger class.
- It passes train_sents to each instance, together with the previous tagger as its backoff.
- The order of the tagger classes is important: in the code above the first class is UnigramTagger, so it is trained first and given the initial backoff tagger (the DefaultTagger).
- This tagger then becomes the backoff tagger for the next tagger class.
- The final tagger returned is an instance of the last tagger class, TrigramTagger.
Code #4 : Proof
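A minimal sketch that inspects the backoff chain, assuming NLTK is installed. It relies on _taggers, a private attribute of NLTK's sequential backoff taggers that may change between versions; the helper and the training data are repeated so the snippet runs on its own.

```python
from nltk.tag import DefaultTagger, UnigramTagger, BigramTagger, TrigramTagger


def backoff_tagger(train_sents, tagger_classes, backoff=None):
    """Train each class in order, chaining the previous tagger as its backoff."""
    for cls in tagger_classes:
        backoff = cls(train_sents, backoff=backoff)
    return backoff


# Hand-made training data (an assumption for illustration, not a real corpus).
train_sents = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]]

backoff = DefaultTagger("NN")
tagger = backoff_tagger(
    train_sents, [UnigramTagger, BigramTagger, TrigramTagger], backoff=backoff
)

# _taggers lists the chain in lookup order: the tagger itself first,
# the initial backoff tagger last.
last_is_default = tagger._taggers[-1] is backoff
first_is_trigram = isinstance(tagger._taggers[0], TrigramTagger)
second_is_bigram = isinstance(tagger._taggers[1], BigramTagger)
print(last_is_default, first_is_trigram, second_is_bigram)
```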
True True True