
NLP | Trigrams’n’Tags (TnT) Tagging

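TnT Tagger : TnT (Trigrams'n'Tags) is a statistical part-of-speech tagger based on a second-order Markov model, in which the probability of each tag is conditioned on the two tags that precede it.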
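To make the "second-order" idea concrete, here is a minimal sketch that estimates P(t3 | t1, t2) from tag trigram counts. It is not NLTK's implementation: the function name trigram_tag_probs, the "BOS" padding tag, and the toy sentences are assumptions made for illustration, and the sketch uses plain relative frequencies, whereas TnT smooths its estimates by interpolating unigram, bigram, and trigram frequencies.

```python
from collections import Counter

def trigram_tag_probs(tagged_sents):
    """Estimate P(t3 | t1, t2) by relative frequency (no smoothing)."""
    trigram_counts = Counter()
    context_counts = Counter()
    for sent in tagged_sents:
        # Pad with two hypothetical begin-of-sentence tags so the first
        # real tag also has a two-tag history.
        tags = ["BOS", "BOS"] + [tag for _, tag in sent]
        for t1, t2, t3 in zip(tags, tags[1:], tags[2:]):
            trigram_counts[(t1, t2, t3)] += 1
            context_counts[(t1, t2)] += 1
    return {
        trigram: count / context_counts[trigram[:2]]
        for trigram, count in trigram_counts.items()
    }

# Toy data, invented for this example
toy_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("the", "DT"), ("old", "JJ"), ("dog", "NN")],
]

probs = trigram_tag_probs(toy_sents)
print(probs[("BOS", "DT", "NN")])  # 2 of the 3 sentences continue DT with NN
```

With real data many trigrams never occur, which is why an actual tagger cannot rely on these raw estimates alone.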

The TnT tagger's API differs from that of most other NLTK taggers: its constructor takes no training data, so you must call the train() method explicitly after creating the tagger.



Code #1 : Using train() method




from nltk.tag import tnt
from nltk.corpus import treebank
  
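# training set: first 3000 tagged sentences; test set: the remainder
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]

# create an untrained TnT tagger
tnt_tagging = tnt.TnT()

# train explicitly on the tagged sentences
tnt_tagging.train(train_data)

# token-level accuracy on the test set
# (recent NLTK releases deprecate evaluate() in favor of accuracy())
a = tnt_tagging.evaluate(test_data)

print("Accuracy of TnT Tagging :", a)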

Output :



Accuracy of TnT Tagging : 0.8756313403842003
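The figure above is a token-level accuracy: the share of test tokens whose predicted tag equals the gold tag. The sketch below shows that computation on made-up data. tagging_accuracy is a hypothetical helper written for this example, not an NLTK function, and it assumes the gold and predicted sentences are aligned token by token.

```python
def tagging_accuracy(gold_sents, predicted_sents):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = 0
    total = 0
    for gold, predicted in zip(gold_sents, predicted_sents):
        for (_, gold_tag), (_, predicted_tag) in zip(gold, predicted):
            correct += int(gold_tag == predicted_tag)
            total += 1
    return correct / total if total else 0.0

# Toy data, invented for this example: 5 tokens, 1 tagging mistake
gold = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("cats", "NNS"), ("sleep", "VBP")],
]
predicted = [
    [("the", "DT"), ("dog", "NN"), ("barks", "NNS")],
    [("cats", "NNS"), ("sleep", "VBP")],
]

score = tagging_accuracy(gold, predicted)
print(score)
```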

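Understanding the working of TnT tagger :

By default, TnT assigns the tag 'Unk' to any word it did not see during training. A separate tagger for such words can be passed through the unk parameter. Setting Trained = True tells TnT that this tagger is ready to use; otherwise TnT calls the unknown-word tagger's train() method on the same training data.

Code #2 : Using a DefaultTagger as the unknown-word tagger (unk)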




from nltk.tag import tnt
from nltk.corpus import treebank
from nltk.tag import DefaultTagger
  
# initializing training and testing set    
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]
  
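# fallback tagger: every unknown word is tagged as a noun ('NN')
unk = DefaultTagger('NN')

# Trained = True: DefaultTagger needs no training, so TnT must not try to train it
tnt_tagging = tnt.TnT(unk = unk, Trained = True)

# training
tnt_tagging.train(train_data)

# evaluating on the test set
a = tnt_tagging.evaluate(test_data)

print("Accuracy of TnT Tagging :", a)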

Output :

Accuracy of TnT Tagging : 0.892467083962875
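The improvement comes from how words absent from the training data are handled. The sketch below illustrates only that fallback step and is deliberately simplified: tag_with_fallback and the known_tags lookup table are hypothetical stand-ins, whereas the real tagger chooses tags for known words from trigram context rather than from a fixed dictionary.

```python
def tag_with_fallback(words, known_tags, fallback_tag=None):
    """Tag known words by lookup; hand unknown words to a fallback.

    Without a fallback, unknown words get the placeholder tag 'Unk'.
    """
    tagged = []
    for word in words:
        if word in known_tags:
            tagged.append((word, known_tags[word]))
        elif fallback_tag is not None:
            tagged.append((word, fallback_tag))
        else:
            tagged.append((word, "Unk"))
    return tagged

# Toy vocabulary, invented for this example; "zebra" is unseen
known = {"the": "DT", "dog": "NN", "barks": "VBZ"}
sentence = ["the", "zebra", "barks"]

without_fallback = tag_with_fallback(sentence, known)
with_fallback = tag_with_fallback(sentence, known, fallback_tag="NN")

print(without_fallback)
print(with_fallback)
```

A placeholder such as 'Unk' never matches a gold tag, so every unseen word counts as an error. Guessing 'NN' is right whenever the unseen word happens to be a singular noun.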

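Controlling Beam Search :

While tagging a sentence, TnT keeps only the N most probable candidate tag sequences at each step (a beam search). The default is N = 1000. A smaller N reduces memory use and tagging time, at the risk of lower accuracy if the best sequence is pruned early.

Code #3 : Using N = 100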




from nltk.tag import tnt
from nltk.corpus import treebank

# initializing training and testing set
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]

# initializing tagger with a beam width of 100 candidate sequences
tnt_tagging = tnt.TnT(N = 100)

# training
tnt_tagging.train(train_data)

# evaluating on the test set
a = tnt_tagging.evaluate(test_data)

print("Accuracy of TnT Tagging :", a)

Output :

Accuracy of TnT Tagging : 0.8756313403842003
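The following is a rough sketch of beam search over tag sequences, meant only to show what a beam width controls. It is not NLTK's code: beam_search_tags, the toy probability tables, and the 1e-6 score for unseen transitions are all assumptions, and for brevity it conditions on one previous tag rather than two. It also assumes every word appears in the emission table.

```python
import math

def beam_search_tags(words, emission, transition, beam_width):
    """Return the best tag sequence found while keeping only
    `beam_width` candidate sequences after each word."""
    beam = [(0.0, ["BOS"])]  # (log-probability, tag history)
    for word in words:
        candidates = []
        for log_prob, history in beam:
            for tag, emit_prob in emission[word].items():
                # Unseen transitions get a small arbitrary probability
                trans_prob = transition.get((history[-1], tag), 1e-6)
                score = log_prob + math.log(trans_prob) + math.log(emit_prob)
                candidates.append((score, history + [tag]))
        # Prune: keep only the highest-scoring sequences
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    return beam[0][1][1:]  # drop the BOS marker

# Toy scores, invented for this example
emission = {
    "they": {"PRP": 1.0},
    "can": {"MD": 0.7, "NN": 0.3},
    "fish": {"VB": 0.5, "NN": 0.5},
}
transition = {
    ("BOS", "PRP"): 1.0,
    ("PRP", "MD"): 0.6, ("PRP", "NN"): 0.1,
    ("MD", "VB"): 0.8, ("MD", "NN"): 0.05,
    ("NN", "VB"): 0.1, ("NN", "NN"): 0.3,
}

words = ["they", "can", "fish"]
wide = beam_search_tags(words, emission, transition, beam_width=10)
narrow = beam_search_tags(words, emission, transition, beam_width=1)
print(wide, narrow)
```

In this toy case the narrow and wide beams agree, which mirrors the unchanged accuracy above: pruning only hurts when the eventual best sequence falls outside the beam at some intermediate step.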
