
NLP | Combining NGram Taggers

NgramTagger has 3 subclasses:

UnigramTagger subclass uses only the current word as its context.
BigramTagger subclass uses the previous tag as part of its context.
TrigramTagger subclass uses the previous two tags as part of its context.



n-gram – a contiguous subsequence of n items, such as n consecutive words or tags in a sentence.
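To make the definition concrete, here is a minimal sketch in plain Python that lists the n-grams of a short word list. The helper name `ngrams` and the sample sentence are illustrative assumptions, not part of the original article.

```python
def ngrams(items, n):
    """Return every contiguous run of n items as a tuple."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


words = ["the", "cat", "sat", "down"]

# Bigrams: every pair of neighbouring words
print(ngrams(words, 2))

# Trigrams: every run of three neighbouring words
print(ngrams(words, 3))
```

A sequence shorter than n has no n-grams, so the helper returns an empty list in that case.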
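Idea of NgramTagger subclasses : The part-of-speech tag of the current word can be guessed from the word itself and the tags of the words before it. Each tagger learns a context table from the training sentences that maps a context (the previous n-1 tags plus the current word) to the most likely tag. If a context was never seen in training, the tagger returns None for that word.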
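As a rough illustration of the context lookup behind a bigram tagger, the toy below builds a table keyed on (previous tag, current word) and tags a sentence from left to right. It is a simplified sketch, not NLTK's implementation: the function names, the `"<s>"` start marker and the one-sentence training set are all assumptions made for the example.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical marker for "no previous tag" at sentence start


def train_bigram_table(tagged_sents):
    """Map (previous tag, current word) to the tag seen most often in that context."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        prev_tag = START
        for word, tag in sent:
            counts[(prev_tag, word)][tag] += 1
            prev_tag = tag
    return {context: c.most_common(1)[0][0] for context, c in counts.items()}


def tag_sentence(table, words):
    """Tag words left to right; an unseen context yields None."""
    tags = []
    prev_tag = START
    for word in words:
        tag = table.get((prev_tag, word))
        tags.append(tag)
        prev_tag = tag  # a None here becomes part of the next context
    return list(zip(words, tags))


train = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")]]
table = train_bigram_table(train)

# Every context was seen in training, so every word gets a tag
print(tag_sentence(table, ["the", "dog", "barks"]))

# "a" is unseen, and the resulting None spoils the following contexts too
print(tag_sentence(table, ["a", "dog", "barks"]))
```

In the second call a single unknown word leaves the rest of the sentence untagged, which is why such a tagger benefits from a backoff.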

Code #1 : Working of Bigram tagger






# Loading Libraries
from nltk.tag import BigramTagger
from nltk.corpus import treebank

# initializing training and testing set
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]

# Training a bigram tagger on its own, without a backoff tagger
tag1 = BigramTagger(train_data)

# Evaluation (newer NLTK releases name this method accuracy())
tag1.evaluate(test_data)

Output :

0.11318799913662854

 
Code #2 : Working of Trigram tagger




# Loading Libraries
from nltk.tag import TrigramTagger
from nltk.corpus import treebank

# initializing training and testing set
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]

# Training a trigram tagger on its own, without a backoff tagger
tag1 = TrigramTagger(train_data)

# Evaluation
tag1.evaluate(test_data)

Output :

0.06876753723289446

Both scores are low because a tagger used alone leaves a word untagged (None) whenever its context did not occur in the training data, and that None then becomes part of the context of the following words.

 
Code #3 : Collectively using Unigram, Bigram and Trigram tagger.




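# Loading Libraries
from nltk.tag import DefaultTagger, UnigramTagger, BigramTagger, TrigramTagger
# backoff_tagger comes from a local helper module (tag_util.py), not from NLTK
from tag_util import backoff_tagger
from nltk.corpus import treebank

# initializing training and testing set
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]

# Last resort when no n-gram tagger can decide: tag the word as a noun
backoff = DefaultTagger('NN')

# Chaining the taggers: Trigram, then Bigram, then Unigram, then Default
tag = backoff_tagger(train_data,
                     [UnigramTagger, BigramTagger, TrigramTagger],
                     backoff = backoff)

# Evaluation
tag.evaluate(test_data)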

Output :

0.8806820634578028

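How it works? backoff_tagger creates an instance of each tagger class in the order given, training it on the sentences and passing the previously created tagger as its backoff. UnigramTagger is therefore trained first with the DefaultTagger as its backoff, BigramTagger backs off to that UnigramTagger, and the tagger returned is the TrigramTagger. When tagging, the TrigramTagger is tried first, and each tagger hands the word to its backoff when it cannot decide.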
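The `tag_util` module imported in Code #3 is not part of NLTK, and the article does not show its source. A minimal sketch of what `backoff_tagger` probably looks like, assuming only the call signature used in Code #3 and that every tagger class accepts a `backoff` keyword (as NLTK's n-gram taggers do), could be:

```python
def backoff_tagger(train_sents, tagger_classes, backoff=None):
    """Train each tagger class in order, chaining it to the previous tagger.

    The first class receives the initial backoff; every later class
    receives the tagger built just before it. The last tagger built
    is returned, so it is the first one consulted when tagging.
    """
    for cls in tagger_classes:
        backoff = cls(train_sents, backoff=backoff)
    return backoff
```

With this saved as tag_util.py next to the script, the import in Code #3 should resolve. Listing the classes from least to most context means the most specific tagger ends up at the front of the chain.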

Code #4 : Proof




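# Continues from Code #3: tag is the combined tagger, backoff the DefaultTagger
from nltk.tag import BigramTagger, TrigramTagger

# The chain ends with the DefaultTagger
print(tag._taggers[-1] == backoff)
# The chain starts with the TrigramTagger, followed by the BigramTagger
print("\n", isinstance(tag._taggers[0], TrigramTagger))
print("\n", isinstance(tag._taggers[1], BigramTagger))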

Output :

True

True

True
