What is Part-of-Speech (POS) tagging?
It is the process of converting a sentence from a list of words into a list of tuples, where each tuple has the form (word, tag). The tag is a part-of-speech tag and signifies whether the word is a noun, adjective, verb, and so on.
Default tagging is the most basic step in part-of-speech tagging. It is performed using the DefaultTagger class. The DefaultTagger class takes a single argument, the tag to assign to every token. NN is the tag for a singular noun.
DefaultTagger is most useful when it is given the most common part-of-speech tag, which is why a noun tag is recommended.
Code #1 : How it works ?
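A minimal sketch of this step, assuming NLTK is installed (pip install nltk). The variable names are illustrative. DefaultTagger never inspects the words themselves, so this example should not need any corpus download.

```python
from nltk.tag import DefaultTagger

# Create a tagger that assigns the singular-noun tag 'NN' to every token.
tagger = DefaultTagger('NN')

# tag() expects a list of tokens, not a raw string.
tagged = tagger.tag(['Hello', 'Geeks'])
print(tagged)
```

Output :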
[('Hello', 'NN'), ('Geeks', 'NN')]
Each tagger has a tag() method that takes a list of tokens (usually a list of words produced by a word tokenizer), where each token is a single word.
tag() returns a list of tagged tokens, where each tagged token is a (word, tag) tuple.
How does DefaultTagger work?
It is a subclass of SequentialBackoffTagger and implements the choose_tag() method, which takes three arguments:
- the list of tokens
- the index of the current token, whose tag is to be chosen
- the list of previous tags (the history)
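The role of the three arguments can be illustrated with a simplified pure-Python stand-in. This is a sketch, not NLTK's source: the class name SketchDefaultTagger is hypothetical, and the backoff chain that SequentialBackoffTagger manages is left out.

```python
class SketchDefaultTagger:
    """Hypothetical stand-in that mimics how a default tagger behaves."""

    def __init__(self, tag):
        self._tag = tag

    def choose_tag(self, tokens, index, history):
        # tokens:  the full list of tokens in the sentence
        # index:   position of the token being tagged
        # history: tags already chosen for tokens[:index]
        # A default tagger ignores all three and returns its fixed tag.
        return self._tag

    def tag(self, tokens):
        # Tag tokens left to right, passing the tags chosen so far as history.
        history = []
        for index in range(len(tokens)):
            history.append(self.choose_tag(tokens, index, history))
        return list(zip(tokens, history))


sketch = SketchDefaultTagger('NN')
print(sketch.choose_tag(['Hello', 'Geeks'], 0, []))
print(sketch.tag(['Hello', 'Geeks']))
```

Because the context is ignored, every token receives the same tag. Other taggers can use these arguments to inspect the current word or the preceding tags.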
Code #2 : Tagging Sentences
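A minimal sketch, assuming NLTK is installed. tag_sents() applies tag() to each sentence in a list of tokenized sentences; the sample sentences here are chosen to match the output that follows.

```python
from nltk.tag import DefaultTagger

tagger = DefaultTagger('NN')

# Each inner list is one already-tokenized sentence.
sentences = [['welcome', 'to', '.'], ['Geeks', 'for', 'Geeks']]

# tag_sents() returns one list of (word, tag) tuples per sentence.
tagged_sents = tagger.tag_sents(sentences)
print(tagged_sents)
```

Output :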
[[('welcome', 'NN'), ('to', 'NN'), ('.', 'NN')], [('Geeks', 'NN'), ('for', 'NN'), ('Geeks', 'NN')]]
Note: Every tag in the list of tagged sentences (in the above code) is NN because we used DefaultTagger with the NN tag.
Code #3 : Illustrating how to untag.
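A minimal sketch, assuming NLTK is installed. The untag() helper from nltk.tag takes a tagged sentence and returns only the words; the input list is illustrative.

```python
from nltk.tag import untag

# A tagged sentence is a list of (word, tag) tuples.
tagged = [('Geeks', 'NN'), ('for', 'NN'), ('Geeks', 'NN')]

# untag() drops the tags and keeps the words in order.
words = untag(tagged)
print(words)
```

Output :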
['Geeks', 'for', 'Geeks']