NgramTagger has three subclasses: UnigramTagger, BigramTagger and TrigramTagger.
The BigramTagger subclass uses the previous tag as part of its context.
The TrigramTagger subclass uses the previous two tags as part of its context.
ngram – a subsequence of n items.
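As a quick illustration of that definition, the sketch below lists the bigrams and trigrams of a short word sequence. The helper name `ngrams_of` is made up for this example and is not part of NLTK.

```python
def ngrams_of(items, n):
    """Return every contiguous run of n items as a tuple."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


words = ["the", "dog", "barks", "loudly"]

bigrams = ngrams_of(words, 2)    # runs of 2 neighbouring words
trigrams = ngrams_of(words, 3)   # runs of 3 neighbouring words

print(bigrams)
print(trigrams)
```

A sequence shorter than n simply yields an empty list.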
Idea of NgramTagger subclasses :
- By looking at the previous words and their part-of-speech (POS) tags, the tag for the current word can be guessed.
- Each tagger maintains a context dictionary (the ContextTagger parent class implements it).
- This dictionary is used to guess the tag based on the context.
- For the NgramTagger subclasses, the context is some number of previously tagged words.
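The context-dictionary idea can be sketched in plain Python, without NLTK. This simplified version assumes a bigram-style context (the previous tag plus the current word) and stores the most frequent tag for each context. The names `train_context_table` and `tag_sentence` are illustrative only, and NLTK's ContextTagger differs in its details.

```python
from collections import Counter, defaultdict


def train_context_table(tagged_sents):
    """Map (previous tag, word) to the tag seen most often in that context."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        prev_tag = None  # no previous tag at the start of a sentence
        for word, tag in sent:
            counts[(prev_tag, word)][tag] += 1
            prev_tag = tag
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


def tag_sentence(table, words):
    """Tag words left to right; a context never seen in training gets None."""
    tagged, prev_tag = [], None
    for word in words:
        tag = table.get((prev_tag, word))
        tagged.append((word, tag))
        prev_tag = tag
    return tagged


# Toy training data, invented for this example.
train = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]]
table = train_context_table(train)

known = tag_sentence(table, ["the", "dog", "barks"])
unknown = tag_sentence(table, ["a", "dog", "barks"])
print(known)
print(unknown)
```

In the second call the unseen word "a" gets no tag, so the contexts of the words after it no longer match anything in the table. This is the weakness that backoff tagging addresses.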
Code #1 : Working of Bigram tagger
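A minimal sketch of a bigram tagger, assuming a tiny hand-made training set in place of a real tagged corpus. The sentences and tags below are invented for illustration; `BigramTagger` and its `tag` method come from `nltk.tag`.

```python
from nltk.tag import BigramTagger

# Toy training data (an assumption for illustration, not a real corpus).
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

bigram_tagger = BigramTagger(train_sents)

# Every (previous tag, word) context in this sentence was seen in training.
seen = bigram_tagger.tag(["the", "cat", "barks"])
print(seen)

# "a" was never seen, so it is tagged None. That None then becomes the
# previous tag for the next word, whose context is therefore unseen too.
unseen = bigram_tagger.tag(["a", "dog", "barks"])
print(unseen)
```

Used on its own, a bigram tagger tends to fail for the rest of a sentence after the first unknown context, so it is normally combined with a backoff tagger.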
Code #2 : Working of Trigram tagger
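A comparable sketch for the trigram tagger, again assuming a single invented training sentence rather than a real corpus. It shows that the previous two tags form part of the context.

```python
from nltk.tag import TrigramTagger

# Toy training data (an assumption for illustration, not a real corpus).
train_sents = [
    [("the", "DT"), ("old", "JJ"), ("dog", "NN"), ("barks", "VBZ")],
]

trigram_tagger = TrigramTagger(train_sents)

# Same sentence as in training: every context is known.
same = trigram_tagger.tag(["the", "old", "dog", "barks"])
print(same)

# Dropping "old" changes the tag history in front of "dog", so that
# context (and the one after it) was never seen during training.
shorter = trigram_tagger.tag(["the", "dog", "barks"])
print(shorter)
```

Because the context is longer, a trigram tagger used alone matches even fewer contexts than a bigram tagger.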
Code #3 : Collectively using Unigram, Bigram and Trigram tagger.
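The sketch below shows one way the combination could look. The body of `backoff_tagger` is a reconstruction from the description in this article, not a function shipped with NLTK, and the training sentences are invented for illustration.

```python
from nltk.tag import DefaultTagger, UnigramTagger, BigramTagger, TrigramTagger


def backoff_tagger(train_sents, tagger_classes, backoff=None):
    """Chain taggers: each new tagger backs off to the one built before it."""
    for cls in tagger_classes:
        backoff = cls(train_sents, backoff=backoff)
    return backoff


# Toy training data (an assumption for illustration, not a real corpus).
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

default = DefaultTagger("NN")  # last resort: tag everything as a noun
tagger = backoff_tagger(
    train_sents,
    [UnigramTagger, BigramTagger, TrigramTagger],
    backoff=default,
)

# The unknown word "a" falls through to the DefaultTagger instead of
# getting None, so the words after it can still be tagged.
result = tagger.tag(["a", "dog", "barks"])
print(result)
```

With a real corpus, the combined tagger would normally be scored on held-out sentences and compared against each tagger used alone.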
How it works ?
- The backoff_tagger function creates an instance of each tagger class.
- Each instance is given train_sents as its training data and the previously created tagger as its backoff.
- The order of the tagger classes is important: when the list starts with UnigramTagger, it is trained first and given the initial backoff tagger (the DefaultTagger).
- This tagger then becomes the backoff tagger for the next tagger class.
- The final tagger returned is an instance of the last tagger class, TrigramTagger.
Code #4 : Proof
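A sketch of how the chain could be inspected. It assumes the same reconstructed `backoff_tagger` helper and toy data as before. It also relies on `_taggers`, a private attribute of NLTK's sequential backoff taggers, so this may not hold across NLTK versions.

```python
from nltk.tag import DefaultTagger, UnigramTagger, BigramTagger, TrigramTagger


def backoff_tagger(train_sents, tagger_classes, backoff=None):
    """Chain taggers: each new tagger backs off to the one built before it."""
    for cls in tagger_classes:
        backoff = cls(train_sents, backoff=backoff)
    return backoff


# Toy training data (an assumption for illustration, not a real corpus).
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
]

default = DefaultTagger("NN")
tagger = backoff_tagger(
    train_sents,
    [UnigramTagger, BigramTagger, TrigramTagger],
    backoff=default,
)

# Private attribute (assumption): the chain from the outermost tagger
# down to the final backoff tagger.
chain = tagger._taggers

print(chain[-1] is default)                 # last link is the DefaultTagger
print(isinstance(chain[0], TrigramTagger))  # outermost tagger
print(isinstance(chain[1], BigramTagger))   # its backoff
```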
Output :
True
True
True
Improved By : shubham_singh