NLP | Leacock-Chodorow (LCH) and Path Similarity for Synsets

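Path-based Similarity: It is a similarity measure based on the length of the shortest path between two synsets in the WordNet hypernym taxonomy. In NLTK the score is 1 / (distance + 1), where distance is the number of edges on that path, so identical synsets score 1 and the score shrinks toward 0 as the synsets get farther apart.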
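To make the idea concrete without needing the WordNet corpus, here is a minimal sketch on a made-up taxonomy. The `HYPERNYM` table and the helper names are hypothetical and only for illustration; the sketch assumes the same 1 / (distance + 1) scoring that NLTK uses, and a plain breadth-first search stands in for NLTK's own path computation.

```python
from collections import deque

# Hypothetical toy taxonomy: each entry maps a concept to its parent (hypernym).
HYPERNYM = {
    "dog": "canine",
    "canine": "carnivore",
    "cat": "feline",
    "feline": "carnivore",
    "carnivore": "animal",
}


def neighbours(node):
    """Return every concept one edge away: the parent plus any children."""
    linked = {child for child, parent in HYPERNYM.items() if parent == node}
    if node in HYPERNYM:
        linked.add(HYPERNYM[node])
    return linked


def shortest_path_length(a, b):
    """Count the edges on the shortest path from a to b (None if unreachable)."""
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def path_similarity(a, b):
    """Score two concepts as 1 / (shortest path length + 1)."""
    dist = shortest_path_length(a, b)
    return None if dist is None else 1.0 / (dist + 1)


print(path_similarity("dog", "dog"))
print(path_similarity("dog", "canine"))
print(path_similarity("dog", "cat"))
```

Here "dog" and "cat" are four edges apart (dog, canine, carnivore, feline, cat), so their score is lower than that of "dog" and its direct parent "canine".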

Leacock-Chodorow (LCH) : It is a similarity measure that extends path-based similarity by incorporating the depth of the taxonomy. It is the negative log of the shortest path (spath) between two concepts (synset_1 and synset_2) divided by twice the total depth of the taxonomy (D), that is, -log(spath / (2 * D)). In NLTK, both synsets must have the same part of speech.
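Written out as a formula, using the symbols from the definition above (the notation sim_LCH is introduced here only for illustration):

```latex
\[
\mathrm{sim}_{\mathrm{LCH}}(\mathit{synset}_1, \mathit{synset}_2)
  = -\log\left( \frac{\mathrm{spath}(\mathit{synset}_1, \mathit{synset}_2)}{2D} \right)
\]
```

In NLTK's implementation, spath is taken as the number of edges on the shortest path plus one, so the argument of the log is never zero. A shorter path gives a smaller fraction and therefore a larger score.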

Code #1 : Introducing Synsets.




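from nltk.corpus import wordnet

# Requires the WordNet corpus: run nltk.download('wordnet') once.
# Take the first synset listed for each word.
syn1 = wordnet.synsets('hello')[0]
syn2 = wordnet.synsets('selling')[0]

print("hello name : ", syn1.name())
print("selling name : ", syn2.name())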



Output :

hello name :  hello.n.01
selling name :  selling.n.01

 
Code #2 : Path Similarity


syn1.path_similarity(syn2) 



Output :

0.08333333333333333

 
Code #3 : Leacock-Chodorow (LCH) Similarity


syn1.lch_similarity(syn2) 



Output :

1.1526795099383855
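The two scores above can be cross-checked with plain arithmetic. The sketch below assumes NLTK's conventions: path similarity is 1 / (distance + 1), and LCH uses distance + 1 as the path length. The depth value D = 19 is an assumption, chosen because it reproduces the printed LCH score; it is not stated in this article.

```python
import math

path_score = 0.08333333333333333  # output of Code #2
lch_score = 1.1526795099383855    # output of Code #3

# Invert path similarity = 1 / (distance + 1) to recover the path length.
nodes_on_path = round(1 / path_score)  # distance + 1
distance = nodes_on_path - 1           # edges between the two synsets

# Assumed maximum depth of the noun taxonomy (not given in the article).
D = 19

# LCH = -log(path length / (2 * D)), with path length = distance + 1.
recomputed_lch = -math.log(nodes_on_path / (2 * D))

print("edges on shortest path:", distance)
print("recomputed LCH:", recomputed_lch)
print("matches Code #3:", abs(recomputed_lch - lch_score) < 1e-9)
```

Under these assumptions, the synsets are 11 edges apart, and both measures are derived from that same shortest path; LCH only rescales it by the taxonomy depth and takes a log.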

