How to get synonyms/antonyms from NLTK WordNet in Python?

WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations.
WordNet’s structure makes it a useful tool for computational linguistics and natural language processing.

WordNet superficially resembles a thesaurus, in that it groups words together based on their meanings. However, there are some important distinctions.

  • First, WordNet interlinks not just word forms—strings of letters—but specific senses of words. As a result, words that are found in close proximity to one another in the network are semantically disambiguated.
  • Second, WordNet labels the semantic relations among words, whereas the groupings of words in a thesaurus does not follow any explicit pattern other than meaning similarity.

# First, you're going to need to import wordnet:
from nltk.corpus import wordnet

# Then, we're going to use the term "program" to find synsets like so:
syns = wordnet.synsets("program")

# An example of a synset:
print(syns[0].name())

# Just the word:
print(syns[0].lemmas()[0].name())

# Definition of that first synset:
print(syns[0].definition())

# Examples of the word in use in sentences:
print(syns[0].examples())

The output will look like:
plan.n.01
plan
a series of steps to be carried out or goals to be accomplished
[‘they drew up a six-step plan’, ‘they discussed plans for a new bond issue’]

Next, how might we discern synonyms and antonyms to a word? The lemmas will be synonyms, and then you can use .antonyms to find the antonyms to the lemmas. As such, we can populate some lists like:

import nltk
from nltk.corpus import wordnet
synonyms = []
antonyms = []

for syn in wordnet.synsets("good"):
    for l in syn.lemmas():
        synonyms.append(l.name())
        if l.antonyms():
            antonyms.append(l.antonyms()[0].name())

print(set(synonyms))
print(set(antonyms))

The output will be two sets of synonyms and antonyms
{‘beneficial’, ‘just’, ‘upright’, ‘thoroughly’, ‘in_force’, ‘well’, ‘skilful’, ‘skillful’, ‘sound’, ‘unspoiled’, ‘expert’, ‘proficient’, ‘in_effect’, ‘honorable’, ‘adept’, ‘secure’, ‘commodity’, ‘estimable’, ‘soundly’, ‘right’, ‘respectable’, ‘good’, ‘serious’, ‘ripe’, ‘salutary’, ‘dear’, ‘practiced’, ‘goodness’, ‘safe’, ‘effective’, ‘unspoilt’, ‘dependable’, ‘undecomposed’, ‘honest’, ‘full’, ‘near’, ‘trade_good’} {‘evil’, ‘evilness’, ‘bad’, ‘badness’, ‘ill’}

Now , let’s compare the similarity index of any two words

import nltk
from nltk.corpus import wordnet
# Let's compare the noun of "ship" and "boat:"

w1 = wordnet.synset('run.v.01') # v here denotes the tag verb
w2 = wordnet.synset('sprint.v.01')
print(w1.wup_similarity(w2))

Output:
0.857142857143

w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('boat.n.01') # n denotes noun
print(w1.wup_similarity(w2))

Output:
0.9090909090909091

This article is contributed by Pratima Upadhyay. If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.

GATE CS Corner    Company Wise Coding Practice

Recommended Posts:







Writing code in comment? Please use ide.geeksforgeeks.org, generate link and share the link here.