
Damerau–Levenshtein distance

The Damerau–Levenshtein distance is a string metric that measures how different two strings are. It is the minimum number of insertions, deletions, substitutions, and transpositions of two adjacent characters needed to transform one string into the other.

It is named after Frederick J. Damerau, an American computer scientist, and Vladimir I. Levenshtein, a Russian mathematician, who introduced the underlying ideas independently in the 1960s.



Definition:

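The distance between two strings a and b can be defined by a function f_{a,b}(i, j), where i and j are the lengths of the prefixes of a and b being compared. For the restricted form of the distance, f_{a,b}(i, j) is the minimum of whichever of the following cases apply:

  • 0, if i = j = 0.
  • f_{a,b}(i - 1, j) + 1, if i > 0 (deletion).
  • f_{a,b}(i, j - 1) + 1, if j > 0 (insertion).
  • f_{a,b}(i - 1, j - 1) + c, if i, j > 0, where c = 0 when a_i = b_j and c = 1 otherwise (match or substitution).
  • f_{a,b}(i - 2, j - 2) + 1, if i, j > 1, a_i = b_{j-1}, and a_{i-1} = b_j (transposition).

The distance between a and b is then f_{a,b}(|a|, |b|).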
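The recursive definition can be sketched directly as a memoized recursion. This is only an illustration of the restricted variant of the recurrence (deletion, insertion, match or substitution, and adjacent transposition); the function name osa_distance_recursive is an assumption made for this example, and the recursion depth limits it to short strings.

```python
from functools import lru_cache


def osa_distance_recursive(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein distance via the recursive definition."""

    @lru_cache(maxsize=None)
    def f(i: int, j: int) -> int:
        # f(i, j): distance between the first i characters of a
        # and the first j characters of b.
        if i == 0 and j == 0:
            return 0
        candidates = []
        if i > 0:
            candidates.append(f(i - 1, j) + 1)  # deletion
        if j > 0:
            candidates.append(f(i, j - 1) + 1)  # insertion
        if i > 0 and j > 0:
            cost = 0 if a[i - 1] == b[j - 1] else 1
            candidates.append(f(i - 1, j - 1) + cost)  # match or substitution
        if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
            candidates.append(f(i - 2, j - 2) + 1)  # adjacent transposition
        return min(candidates)

    return f(len(a), len(b))


print(osa_distance_recursive("ab", "ba"))  # one transposition -> 1
```

Memoization keeps the number of evaluated (i, j) pairs at (|a| + 1) × (|b| + 1), which is the same work a table-based version performs.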

Algorithms:

The Damerau–Levenshtein distance can be calculated in two ways:



1. Optimal String Alignment Distance:

The optimal string alignment (OSA) distance, also called the restricted edit distance, measures how closely two strings match one another.

It is the minimum number of single-character insertions, deletions, and substitutions, plus transpositions of two adjacent characters, needed to change one string into the other, under the restriction that no substring is edited more than once.

Example: The optimal string alignment distance between the words “kitten” and “sitting” is 3, because it takes three edits to change one word into the other: replacing “k” with “s”, replacing “e” with “i”, and adding “g” at the end.

Algorithm:

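The distance can be computed with dynamic programming: a table dp stores in dp[i][j] the distance between the first i characters of s1 and the first j characters of s2, and each entry is derived from previously computed neighboring entries. The answer is the entry for the two full strings.

Below is the implementation of this approach.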




// C++ code to calculate optimal string alignment distance
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

int optimalStringAlignmentDistance(const string& s1, const string& s2) {
    int n = s1.length(), m = s2.length();

    // Create a table to store the results of subproblems
    vector<vector<int>> dp(n + 1, vector<int>(m + 1));

    // Initialize the table
    for (int i = 0; i <= n; i++) {
        dp[i][0] = i;
    }
    for (int j = 0; j <= m; j++) {
        dp[0][j] = j;
    }

    // Populate the table using dynamic programming
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= m; j++) {
            int cost = (s1[i-1] == s2[j-1]) ? 0 : 1;
            dp[i][j] = min({dp[i-1][j] + 1,         // deletion
                            dp[i][j-1] + 1,         // insertion
                            dp[i-1][j-1] + cost});  // match or substitution
            // Transposition of two adjacent characters counts as one edit
            if (i > 1 && j > 1 && s1[i-1] == s2[j-2] && s1[i-2] == s2[j-1]) {
                dp[i][j] = min(dp[i][j], dp[i-2][j-2] + 1);
            }
        }
    }

    // Return the edit distance
    return dp[n][m];
}

int main() {
    cout << optimalStringAlignmentDistance("geeks", "forgeeks") << endl;
    return 0;
}
// This code is contributed by Vikram_Shirsat




// Java code to calculate optimal string alignment distance
public class Main {
    public static int optimalStringAlignmentDistance(String s1, String s2) {
        // Create a table to store the results of subproblems
        int[][] dp = new int[s1.length()+1][s2.length()+1];
 
        // Initialize the table
        for (int i = 0; i <= s1.length(); i++) {
            dp[i][0] = i;
        }
        for (int j = 0; j <= s2.length(); j++) {
            dp[0][j] = j;
        }
 
        // Populate the table using dynamic programming
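        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                int cost = (s1.charAt(i-1) == s2.charAt(j-1)) ? 0 : 1;
                // Deletion, insertion, and match or substitution
                dp[i][j] = Math.min(Math.min(dp[i-1][j] + 1, dp[i][j-1] + 1),
                                    dp[i-1][j-1] + cost);
                // Transposition of two adjacent characters counts as one edit
                if (i > 1 && j > 1 && s1.charAt(i-1) == s2.charAt(j-2)
                        && s1.charAt(i-2) == s2.charAt(j-1)) {
                    dp[i][j] = Math.min(dp[i][j], dp[i-2][j-2] + 1);
                }
            }
        }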
 
        // Return the edit distance
        return dp[s1.length()][s2.length()];
    }
 
    public static void main(String[] args) {
        System.out.println(optimalStringAlignmentDistance("geeks", "forgeeks"));
    }
}
 
//This code is contributed by shivamsharma215




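# Python code to calculate optimal string alignment distance
def optimal_string_alignment_distance(s1, s2):
    # Create a table to store the results of subproblems
    dp = [[0 for j in range(len(s2)+1)] for i in range(len(s1)+1)]

    # Initialize the table
    for i in range(len(s1)+1):
        dp[i][0] = i
    for j in range(len(s2)+1):
        dp[0][j] = j

    # Populate the table using dynamic programming
    for i in range(1, len(s1)+1):
        for j in range(1, len(s2)+1):
            cost = 0 if s1[i-1] == s2[j-1] else 1
            dp[i][j] = min(dp[i-1][j] + 1,       # deletion
                           dp[i][j-1] + 1,       # insertion
                           dp[i-1][j-1] + cost)  # match or substitution
            # Transposition of two adjacent characters counts as one edit
            if i > 1 and j > 1 and s1[i-1] == s2[j-2] and s1[i-2] == s2[j-1]:
                dp[i][j] = min(dp[i][j], dp[i-2][j-2] + 1)

    # Return the edit distance
    return dp[len(s1)][len(s2)]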
 
print(optimal_string_alignment_distance("geeks", "forgeeks"))




// C# code to calculate optimal string alignment distance
using System;
 
class MainClass {
    static int OptimalStringAlignmentDistance(string s1,
                                              string s2)
    {
        // Create a table to store the results of
        // subproblems
        int[, ] dp = new int[s1.Length + 1, s2.Length + 1];
        // Initialize the table
        for (int i = 0; i <= s1.Length; i++) {
            dp[i, 0] = i;
        }
        for (int j = 0; j <= s2.Length; j++) {
            dp[0, j] = j;
        }
 
        // Populate the table using dynamic programming
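        for (int i = 1; i <= s1.Length; i++) {
            for (int j = 1; j <= s2.Length; j++) {
                int cost = (s1[i - 1] == s2[j - 1]) ? 0 : 1;
                // Deletion, insertion, and match or substitution
                dp[i, j] = Math.Min(
                    Math.Min(dp[i - 1, j] + 1, dp[i, j - 1] + 1),
                    dp[i - 1, j - 1] + cost);
                // Transposition of two adjacent characters
                // counts as one edit
                if (i > 1 && j > 1 && s1[i - 1] == s2[j - 2]
                    && s1[i - 2] == s2[j - 1]) {
                    dp[i, j] = Math.Min(dp[i, j],
                                        dp[i - 2, j - 2] + 1);
                }
            }
        }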
 
        // Return the edit distance
        return dp[s1.Length, s2.Length];
    }
 
    public static void Main()
    {
        Console.WriteLine(OptimalStringAlignmentDistance(
            "geeks", "forgeeks"));
    }
}
// This code is contributed by japmeet01




// Javascript function to calculate optimal string alignment distance
function optimalStringAlignmentDistance(s1, s2)
{
    // Create a table to store the results of subproblems
    let dp = new Array(s1.length + 1).fill(0)
        .map(() => new Array(s2.length + 1).fill(0));

    // Initialize the table
    for (let i = 0; i <= s1.length; i++) {
        dp[i][0] = i;
    }
    for (let j = 0; j <= s2.length; j++) {
        dp[0][j] = j;
    }

    // Populate the table using dynamic programming
    for (let i = 1; i <= s1.length; i++) {
        for (let j = 1; j <= s2.length; j++) {
            const cost = (s1[i - 1] === s2[j - 1]) ? 0 : 1;
            // Deletion, insertion, and match or substitution
            dp[i][j] = Math.min(dp[i - 1][j] + 1, dp[i][j - 1] + 1,
                                dp[i - 1][j - 1] + cost);
            // Transposition of two adjacent characters counts as one edit
            if (i > 1 && j > 1 && s1[i - 1] === s2[j - 2]
                && s1[i - 2] === s2[j - 1]) {
                dp[i][j] = Math.min(dp[i][j], dp[i - 2][j - 2] + 1);
            }
        }
    }

    // Return the edit distance
    return dp[s1.length][s2.length];
}
 
console.log(optimalStringAlignmentDistance("geeks", "forgeeks"));
 
// This code is contributed by lokeshpotta20.

Output
3

2. Damerau–Levenshtein Distance with Adjacent Transpositions:

The Levenshtein distance algorithm can be extended so that transposing (swapping) two adjacent characters counts as a single edit. Unlike the optimal string alignment distance, this unrestricted variant places no limit on how often a substring may be edited, so characters can still be inserted or deleted between two characters that are transposed. It is frequently used where it makes more sense to treat a transposition as one edit rather than as two independent edits (two substitutions, or a deletion and an insertion), as is the case with the traditional Levenshtein distance.

Example: The distance between the words “flom” and “flmo” is 1, since a single transposition of the adjacent characters “o” and “m” changes one word into the other. Changing “flom” into “molf” takes 3 edits, because “f” and “m” are not adjacent: two substitutions plus one transposition of “l” and “o”.
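To see why a swap of two adjacent characters costs two edits under the traditional Levenshtein distance, here is a minimal sketch of that baseline, keeping only two rows of the table. The function name levenshtein is chosen for this example and is not taken from any library.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions only."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning the first i characters of a into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


print(levenshtein("ab", "ba"))  # 2: the swap is counted as two edits
```

A distance that supports transpositions reports 1 for the same pair of strings.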

Algorithm:

The unrestricted distance is also computed with a dynamic-programming table d, where d(i, j) is the distance between the first i characters of a and the first j characters of b. In addition to the insertion, deletion, and substitution cases of the Levenshtein distance, each entry considers a transposition of two characters that may have other characters between them.

The cost of reaching d(i, j) through a transposition is:

d(k - 1, l - 1) + (i - k - 1) + 1 + (j - l - 1)

Where:

  • k – is the last position before i in a at which the character b[j] occurs.
  • l – is the last position before j in b at which the character a[i] occurs.
  • (i - k - 1) – is the number of characters deleted from a between the two transposed characters, whereas (j - l - 1) is the number of characters inserted into b between them.
  • 1 – is the cost of the transposition itself.

The entry d(i, j) is the minimum of this cost and the three Levenshtein cases. The transposition case is skipped when no such k or l exists.
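As a hedged illustration, the unrestricted distance can be sketched with the commonly used table-based algorithm that remembers, for each character, the last row in which it was seen. The function name damerau_levenshtein and the variable names last_row and last_col are assumptions made for this example. The table carries one extra border row and column filled with a large value so that a missing earlier occurrence never wins the minimum.

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Unrestricted Damerau-Levenshtein distance (sketch)."""
    max_dist = len(a) + len(b)
    last_row = {}  # last_row[ch]: last 1-based position in a where ch occurred

    # d[i + 1][j + 1] holds the distance between a[:i] and b[:j];
    # row 0 and column 0 form a border filled with max_dist.
    d = [[max_dist] * (len(b) + 2) for _ in range(len(a) + 2)]
    for i in range(len(a) + 1):
        d[i + 1][1] = i
    for j in range(len(b) + 1):
        d[1][j + 1] = j

    for i in range(1, len(a) + 1):
        last_col = 0  # last column in this row where b matched a[i - 1]
        for j in range(1, len(b) + 1):
            k = last_row.get(b[j - 1], 0)
            l = last_col
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            else:
                cost = 1
            d[i + 1][j + 1] = min(
                d[i][j] + cost,   # match or substitution
                d[i + 1][j] + 1,  # insertion
                d[i][j + 1] + 1,  # deletion
                d[k][l] + (i - k - 1) + 1 + (j - l - 1),  # transposition
            )
        last_row[a[i - 1]] = i

    return d[len(a) + 1][len(b) + 1]


print(damerau_levenshtein("ca", "abc"))  # 2: "ca" -> "ac" -> "abc"
```

The pair “ca” and “abc” shows the difference between the two variants: the optimal string alignment distance is 3, because it may not insert “b” between the two characters it has just transposed.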

Application:

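This has a variety of uses in areas such as spell checking and correction, approximate (fuzzy) string matching in search, and the comparison of biological sequences such as DNA.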
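As one illustration of the spell-checking use case, the sketch below ranks words from a small word list by their optimal string alignment distance to a misspelled input. The helper names osa_distance and suggest, the word list, and the max_distance threshold are all hypothetical choices for this example.

```python
def osa_distance(s1: str, s2: str) -> int:
    """Optimal string alignment distance (adjacent transpositions allowed)."""
    dp = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    for i in range(len(s1) + 1):
        dp[i][0] = i
    for j in range(len(s2) + 1):
        dp[0][j] = j
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # match or substitution
            if (i > 1 and j > 1 and s1[i - 1] == s2[j - 2]
                    and s1[i - 2] == s2[j - 1]):
                dp[i][j] = min(dp[i][j], dp[i - 2][j - 2] + 1)  # transposition
    return dp[len(s1)][len(s2)]


def suggest(word, vocabulary, max_distance=2):
    """Return vocabulary words within max_distance, closest first."""
    scored = sorted((osa_distance(word, w), w) for w in vocabulary)
    return [w for dist, w in scored if dist <= max_distance]


vocabulary = ["from", "form", "farm", "frame", "forum"]  # hypothetical word list
print(suggest("fomr", vocabulary))  # ['form', 'farm', 'from']
```

Because the swapped “m” and “r” count as a single edit, “form” ranks first at distance 1. Ties are broken alphabetically by the sort.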

