
How to Calculate the Levenshtein Distance Between Two Strings in Java Using Recursion?

Last Updated : 23 Feb, 2024

The Levenshtein distance measures how different two strings are: it is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other. The Java standard library has no built-in method for it, so in this article we implement it ourselves with a recursive function.

Prerequisites:

  • Recursion
  • Dynamic Programming
  • String Manipulation

How Does this Algorithm Work?

The distance is usually described in terms of a 2D array of size (m+1) * (n+1), where m and n are the lengths of the two input strings. The recursive program in this article evaluates the same rules without storing the array. Start with the base cases: if either string is empty, the distance is the length of the other string.

if (len1 != 0 && len2 != 0), proceed to the next steps.

After that, initialize the first row and first column of the 2D array with the number of edits required to transform an empty string into the corresponding prefix of the input string. Then walk through the characters of both strings and compute the distances:

  • For each pair of characters str1[i] and str2[j]:
  • If str1[i] is equal to str2[j], the substitution cost is 0; otherwise it is 1.
  • Set the array entry to the minimum of the three operations below (illustrated with “Java” and “JavaScript”):
      • Insertion: take the distance between “Java” and “JavaScrip”, then add 1 for inserting the final character.
      • Deletion: take the distance between “Jav” and “JavaScript”, then add 1 for deleting the final character.
      • Substitution: take the distance between “Jav” and “JavaScrip”, then add the substitution cost.

The final value, which is the minimum total cost of these operations, is the Levenshtein distance between the two strings.
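The table-based steps described above can be sketched as follows. This is a minimal illustration and not the recursive program this article focuses on; the class name LevenshteinTable and the method name distance are assumptions chosen for the example.

```java
// Sketch of the 2D-array formulation (names are illustrative).
class LevenshteinTable {

    static int distance(String str1, String str2) {
        int m = str1.length();
        int n = str2.length();
        int[][] dp = new int[m + 1][n + 1];

        // First column: i deletions turn a prefix of str1 into the empty string.
        for (int i = 0; i <= m; i++) {
            dp[i][0] = i;
        }
        // First row: j insertions turn the empty string into a prefix of str2.
        for (int j = 0; j <= n; j++) {
            dp[0][j] = j;
        }

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                // Substitution is free when the two characters already match.
                int cost = (str1.charAt(i - 1) == str2.charAt(j - 1)) ? 0 : 1;
                int insertion = dp[i][j - 1] + 1;
                int deletion = dp[i - 1][j] + 1;
                int substitution = dp[i - 1][j - 1] + cost;
                dp[i][j] = Math.min(Math.min(insertion, deletion), substitution);
            }
        }
        // The bottom-right cell holds the distance between the full strings.
        return dp[m][n];
    }

    public static void main(String[] args) {
        System.out.println(distance("Java", "JavaScript")); // prints 6
        System.out.println(distance("kitten", "sitting"));  // prints 3
    }
}
```

Each cell dp[i][j] holds the distance between the first i characters of str1 and the first j characters of str2, so the table is filled in O(m * n) time.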

Sample Program

Java




// Java Program to Calculate the Levenshtein distance
// Between two Strings in Java Using Recursion
  
public class GfGLevenshteinDistance {
      
    public static int calculateDistance(String str1, String str2) {
        return calculateDistanceRecursive(str1, str1.length(), str2, str2.length());
    }
  
    private static int calculateDistanceRecursive(String str1, int len1, String str2, int len2) {
        // Base cases: if either string is empty,
        // return the length of the other string
        if (len1 == 0) {
            return len2;
        }
        if (len2 == 0) {
            return len1;
        }

        // If the last characters of the strings are equal,
        // no operation is required
        if (str1.charAt(len1 - 1) == str2.charAt(len2 - 1)) {
            return calculateDistanceRecursive(str1, len1 - 1, str2, len2 - 1);
        }

        // Calculate the cost of the three possible operations:
        // insertion, deletion, and substitution
        int insertionCost = calculateDistanceRecursive(str1, len1, str2, len2 - 1);
        int deletionCost = calculateDistanceRecursive(str1, len1 - 1, str2, len2);
        int substitutionCost = calculateDistanceRecursive(str1, len1 - 1, str2, len2 - 1);

        // Return the minimum of the three costs
        // plus 1 (for the operation itself)
        return 1 + Math.min(Math.min(insertionCost, deletionCost), substitutionCost);
    }
  
    public static void main(String[] args) {
        String str1 = "Java";
        String str2 = "JavaScript";
        int distance = calculateDistance(str1, str2);
        System.out.println("Levenshtein distance between \"" + str1 + "\" and \"" + str2 + "\" is: " + distance);
    }
}


Output

Levenshtein distance between "Java" and "JavaScript" is: 6






Explanation of the above Program:

In the above example, the program calculates the Levenshtein distance between two strings:

  • First, check the base cases: if one string is empty, the distance is equal to the length of the other string.
  • Next, compare the last characters of the two strings. If they are equal, no operation is needed, and the function calls itself with the lengths of both strings decremented by 1.
  • Otherwise, calculate the cost of the three possible operations:
      • Insertion: decrement the length of string 2 and keep string 1 unchanged.
      • Deletion: decrement the length of string 1 and keep string 2 unchanged.
      • Substitution: decrement the lengths of both string 1 and string 2.
  • The function then returns the minimum of the three costs plus 1 (for the operation just applied).
  • The calculateDistance() method is the entry point: it is called with strings 1 and 2 and starts the recursion with their full lengths.
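Note that the plain recursion solves the same (len1, len2) subproblems many times, so its running time grows exponentially with the string lengths. One possible remedy is to cache each result the first time it is computed. The following is a minimal sketch of that idea, assuming a hypothetical class LevenshteinMemo with a helper named solve; it is not part of the sample program above.

```java
import java.util.Arrays;

// Sketch: the same recursion with memoization (names are illustrative).
class LevenshteinMemo {

    static int distance(String str1, String str2) {
        // memo[i][j] caches the distance between the first i characters
        // of str1 and the first j characters of str2; -1 means "not computed".
        int[][] memo = new int[str1.length() + 1][str2.length() + 1];
        for (int[] row : memo) {
            Arrays.fill(row, -1);
        }
        return solve(str1, str1.length(), str2, str2.length(), memo);
    }

    private static int solve(String str1, int len1, String str2, int len2, int[][] memo) {
        // Base cases: one string is empty.
        if (len1 == 0) {
            return len2;
        }
        if (len2 == 0) {
            return len1;
        }
        // Reuse a previously computed result if there is one.
        if (memo[len1][len2] != -1) {
            return memo[len1][len2];
        }

        int result;
        if (str1.charAt(len1 - 1) == str2.charAt(len2 - 1)) {
            // Last characters match: no edit needed at this position.
            result = solve(str1, len1 - 1, str2, len2 - 1, memo);
        } else {
            int insertion = solve(str1, len1, str2, len2 - 1, memo);
            int deletion = solve(str1, len1 - 1, str2, len2, memo);
            int substitution = solve(str1, len1 - 1, str2, len2 - 1, memo);
            result = 1 + Math.min(Math.min(insertion, deletion), substitution);
        }
        memo[len1][len2] = result;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(distance("Java", "JavaScript")); // prints 6
    }
}
```

With the cache in place, each (len1, len2) pair is computed at most once, which bounds the work at O(m * n) for strings of lengths m and n.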

