Related Articles

Related Articles

NLP | Classifier-based tagging
  • Last Updated : 16 Dec, 2019

ClassifierBasedPOSTagger class:

  • It is a subclass of ClassifierBasedTagger that uses classification technique to do part-of-speech tagging.
  • From the words, features are extracted and then passed to an internal classifier.
  • It classifies the features and returns a label i.e. a part-of-speech tag.
  • The feature detector finds multiple length suffixes, does some regular expression matching, and looks at the unigram, bigram, and trigram history to produce a fairly complete set of features for each word

Code #1 : Using ClassifierBasedPOSTagger

filter_none

edit
close

play_arrow

link
brightness_4
code

from nltk.tag.sequential import ClassifierBasedPOSTagger
from nltk.corpus import treebank
  
# initializing training and testing set    
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]
  
tagging = ClassifierBasedPOSTagger(train = train_data)
  
a = tagging.evaluate(test_data)
  
print ("Accuracy : ", a)

chevron_right


Output :

Accuracy : 0.9309734513274336

ClassifierBasedPOSTagger class inherits from ClassifierBasedTagger and only implements a feature_detector() method. All the training and tagging is done in ClassifierBasedTagger.

Code #2 : Using MaxentClassifier



filter_none

edit
close

play_arrow

link
brightness_4
code

from nltk.classify import MaxentClassifier
from nltk.corpus import treebank
  
# initializing training and testing set    
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]
  
  
tagger = ClassifierBasedPOSTagger(
        train = train_sents, classifier_builder = MaxentClassifier.train)
  
a = tagger.evaluate(test_data)
  
print ("Accuracy : ", a)

chevron_right


Output :

Accuracy : 0.9258363911072739

custom feature detector detecting features
There are two ways to do it:

  1. Subclass ClassifierBasedTagger and implement a feature_detector() method.
  2. Pass a function as the feature_detector keyword argument into ClassifierBasedTagger at initialization.

Code #3 : Custom Feature Detector

filter_none

edit
close

play_arrow

link
brightness_4
code

from nltk.tag.sequential import ClassifierBasedTagger
from tag_util import unigram_feature_detector
from nltk.corpus import treebank
  
# initializing training and testing set    
train_data = treebank.tagged_sents()[:3000]
test_data = treebank.tagged_sents()[3000:]
  
tag = ClassifierBasedTagger(
        train = train_data, 
        feature_detector = unigram_feature_detector)
  
a = tagger.evaluate(test_data)
  
print ("Accuracy : ", a)

chevron_right


Output :

Accuracy : 0.8733865745737104

Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.

To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course.




My Personal Notes arrow_drop_up
Recommended Articles
Page :