Open In App

NLP | Classifier-based tagging

ClassifierBasedPOSTagger class:

Code #1 : Using ClassifierBasedPOSTagger






from nltk.tag.sequential import ClassifierBasedPOSTagger
from nltk.corpus import treebank
  
# initializing training and testing set    
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]
  
tagging = ClassifierBasedPOSTagger(train = train_data)
  
a = tagging.evaluate(test_data)
  
print ("Accuracy : ", a)

Output :

Accuracy : 0.9309734513274336

ClassifierBasedPOSTagger class inherits from ClassifierBasedTagger and only implements a feature_detector() method. All the training and tagging is done in ClassifierBasedTagger.



Code #2 : Using MaxentClassifier




from nltk.classify import MaxentClassifier
from nltk.corpus import treebank
  
# initializing training and testing set    
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]
  
  
tagger = ClassifierBasedPOSTagger(
        train = train_sents, classifier_builder = MaxentClassifier.train)
  
a = tagger.evaluate(test_data)
  
print ("Accuracy : ", a)

Output :

Accuracy : 0.9258363911072739

custom feature detector detecting features
There are two ways to do it:

  1. Subclass ClassifierBasedTagger and implement a feature_detector() method.
  2. Pass a function as the feature_detector keyword argument into ClassifierBasedTagger at initialization.

Code #3 : Custom Feature Detector




from nltk.tag.sequential import ClassifierBasedTagger
from tag_util import unigram_feature_detector
from nltk.corpus import treebank
  
# initializing training and testing set    
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]
  
tag = ClassifierBasedTagger(
        train = train_data, 
        feature_detector = unigram_feature_detector)
  
a = tagger.evaluate(test_data)
  
print ("Accuracy : ", a)

Output :

Accuracy : 0.8733865745737104

Article Tags :