# Dynamic Programming | Set 5 (Edit Distance)

Given two strings str1 and str2 and the operations below that can be performed on str1, find the minimum number of edits (operations) required to convert ‘str1’ into ‘str2’.

1. Insert
2. Remove
3. Replace

All of the above operations are of equal cost.

Examples:

```
Input:   str1 = "geek", str2 = "gesek"
Output:  1
We can convert str1 into str2 by inserting an 's'.

Input:   str1 = "cat", str2 = "cut"
Output:  1
We can convert str1 into str2 by replacing 'a' with 'u'.

Input:   str1 = "sunday", str2 = "saturday"
Output:  3
The last three characters and the first character are the same. We basically
need to convert "un" to "atur". This can be done using
the three operations below.
Replace 'n' with 'r', insert 't', insert 'a'
```

What are the subproblems in this case?
The idea is to process all characters one by one, starting from either the left or the right end of both strings.
Let us traverse from the right end. There are two possibilities for every pair of characters being traversed.

```
m: Length of str1 (first string)
n: Length of str2 (second string)
```

1. If the last characters of the two strings are the same, there is nothing much to do. Ignore the last characters and get the count for the remaining strings. So we recur for lengths m-1 and n-1.
2. Else (if the last characters are not the same), we consider all three operations on the last character of the first string, recursively compute the minimum cost for each of them, and take the minimum of the three values.
   1. Insert: Recur for m and n-1
   2. Remove: Recur for m-1 and n
   3. Replace: Recur for m-1 and n-1

Below are C++, Java, and Python implementations of the naive recursive solution above.

## C++

```
// A naive recursive C++ program to find the minimum number of
// operations to convert str1 to str2
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;

// Utility function to find minimum of three numbers
int min(int x, int y, int z)
{
    return std::min(std::min(x, y), z);
}

// Strings are passed by const reference to avoid copying them on every call
int editDist(const string& str1, const string& str2, int m, int n)
{
    // If first string is empty, the only option is to
    // insert all characters of second string into first
    if (m == 0) return n;

    // If second string is empty, the only option is to
    // remove all characters of first string
    if (n == 0) return m;

    // If last characters of two strings are same, nothing
    // much to do. Ignore last characters and get count for
    // remaining strings.
    if (str1[m-1] == str2[n-1])
        return editDist(str1, str2, m-1, n-1);

    // If last characters are not same, consider all three
    // operations on last character of first string, recursively
    // compute minimum cost for all three operations and take
    // minimum of three values.
    return 1 + min(editDist(str1, str2, m, n-1),    // Insert
                   editDist(str1, str2, m-1, n),    // Remove
                   editDist(str1, str2, m-1, n-1)   // Replace
                  );
}

// Driver program
int main()
{
    string str1 = "sunday";
    string str2 = "saturday";

    cout << editDist(str1, str2, str1.length(), str2.length());

    return 0;
}
```

## Java

```
// A naive recursive Java program to find the minimum number of
// operations to convert str1 to str2
class EDIST
{
    // Minimum of three numbers (handles ties correctly)
    static int min(int x, int y, int z)
    {
        return Math.min(Math.min(x, y), z);
    }

    static int editDist(String str1, String str2, int m, int n)
    {
        // If first string is empty, the only option is to
        // insert all characters of second string into first
        if (m == 0) return n;

        // If second string is empty, the only option is to
        // remove all characters of first string
        if (n == 0) return m;

        // If last characters of two strings are same, nothing
        // much to do. Ignore last characters and get count for
        // remaining strings.
        if (str1.charAt(m-1) == str2.charAt(n-1))
            return editDist(str1, str2, m-1, n-1);

        // If last characters are not same, consider all three
        // operations on last character of first string, recursively
        // compute minimum cost for all three operations and take
        // minimum of three values.
        return 1 + min(editDist(str1, str2, m, n-1),    // Insert
                       editDist(str1, str2, m-1, n),    // Remove
                       editDist(str1, str2, m-1, n-1)   // Replace
                      );
    }

    public static void main(String args[])
    {
        String str1 = "sunday";
        String str2 = "saturday";

        System.out.println(editDist(str1, str2, str1.length(), str2.length()));
    }
}
/* This code is contributed by Rajat Mishra */
```

## Python

```
# A naive recursive Python program to find the minimum number of
# operations to convert str1 to str2
def editDistance(str1, str2, m, n):

    # If first string is empty, the only option is to
    # insert all characters of second string into first
    if m == 0:
        return n

    # If second string is empty, the only option is to
    # remove all characters of first string
    if n == 0:
        return m

    # If last characters of two strings are same, nothing
    # much to do. Ignore last characters and get count for
    # remaining strings.
    if str1[m-1] == str2[n-1]:
        return editDistance(str1, str2, m-1, n-1)

    # If last characters are not same, consider all three
    # operations on last character of first string, recursively
    # compute minimum cost for all three operations and take
    # minimum of three values.
    return 1 + min(editDistance(str1, str2, m, n-1),    # Insert
                   editDistance(str1, str2, m-1, n),    # Remove
                   editDistance(str1, str2, m-1, n-1)   # Replace
                   )

# Driver program to test the above function
str1 = "sunday"
str2 = "saturday"
print(editDistance(str1, str2, len(str1), len(str2)))

# This code is contributed by Bhavya Jain
```

Output:

`3`

The time complexity of the above solution is exponential. In the worst case, we may end up doing O(3^m) operations. The worst case happens when none of the characters of the two strings match.

[Figure: recursive call diagram for the worst case]
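The growth in the number of calls can also be observed directly. The following is a minimal sketch, not part of the original article: `count_calls` is a hypothetical helper that runs the same naive recursion and records how often each `(m, n)` pair is visited. The strings `"abc"` and `"xyz"` are arbitrary examples chosen because none of their characters match.

```python
from collections import Counter


def count_calls(str1, str2):
    """Run the naive recursion and count visits to each (m, n) subproblem."""
    calls = Counter()

    def edit_dist(m, n):
        calls[(m, n)] += 1
        if m == 0:
            return n
        if n == 0:
            return m
        if str1[m - 1] == str2[n - 1]:
            return edit_dist(m - 1, n - 1)
        return 1 + min(edit_dist(m, n - 1),      # Insert
                       edit_dist(m - 1, n),      # Remove
                       edit_dist(m - 1, n - 1))  # Replace

    distance = edit_dist(len(str1), len(str2))
    return distance, calls


# Worst case: no characters match.
distance, calls = count_calls("abc", "xyz")
print("edit distance:", distance)
print("calls to (2, 2):", calls[(2, 2)])
print("total calls:", sum(calls.values()))
print("distinct subproblems:", len(calls))
```

The total number of calls is much larger than the number of distinct `(m, n)` pairs, which is at most (m+1) x (n+1).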

We can see that many subproblems are solved again and again; for example, eD(2,2) is called three times. Since the same subproblems are called again, this problem has the Overlapping Subproblems property. So the Edit Distance problem has both properties of a dynamic programming problem (overlapping subproblems and optimal substructure). As with other typical Dynamic Programming (DP) problems, recomputation of the same subproblems can be avoided by constructing a temporary array that stores the results of subproblems.
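The idea of storing subproblem results can be sketched on the recursive version first. This is an illustrative top-down variant, not code from the original article: `edit_distance_memo` and `solve` are names chosen here, and the cache is `functools.lru_cache` from the Python standard library.

```python
from functools import lru_cache


def edit_distance_memo(str1, str2):
    """Top-down edit distance; each (m, n) subproblem is computed only once."""

    @lru_cache(maxsize=None)
    def solve(m, n):
        if m == 0:
            return n  # insert the remaining n characters
        if n == 0:
            return m  # remove the remaining m characters
        if str1[m - 1] == str2[n - 1]:
            return solve(m - 1, n - 1)
        return 1 + min(solve(m, n - 1),      # Insert
                       solve(m - 1, n),      # Remove
                       solve(m - 1, n - 1))  # Replace

    return solve(len(str1), len(str2))


print(edit_distance_memo("sunday", "saturday"))  # 3
```

Because it still recurses, this sketch can hit Python's recursion limit on long strings; a bottom-up table avoids that.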

## C++

```
// A Dynamic Programming based C++ program to find the minimum
// number of operations to convert str1 to str2
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Utility function to find minimum of three numbers
int min(int x, int y, int z)
{
    return std::min(std::min(x, y), z);
}

int editDistDP(const string& str1, const string& str2, int m, int n)
{
    // Create a table to store results of subproblems
    // (a vector is used because variable-length arrays are not standard C++)
    vector<vector<int>> dp(m + 1, vector<int>(n + 1));

    // Fill dp[][] in bottom up manner
    for (int i = 0; i <= m; i++)
    {
        for (int j = 0; j <= n; j++)
        {
            // If first string is empty, only option is to
            // insert all characters of second string
            if (i == 0)
                dp[i][j] = j;  // Min. operations = j

            // If second string is empty, only option is to
            // remove all characters of first string
            else if (j == 0)
                dp[i][j] = i;  // Min. operations = i

            // If last characters are same, ignore last char
            // and use the result for the remaining strings
            else if (str1[i-1] == str2[j-1])
                dp[i][j] = dp[i-1][j-1];

            // If last characters are different, consider all
            // possibilities and find minimum
            else
                dp[i][j] = 1 + min(dp[i][j-1],    // Insert
                                   dp[i-1][j],    // Remove
                                   dp[i-1][j-1]); // Replace
        }
    }

    return dp[m][n];
}

// Driver program
int main()
{
    string str1 = "sunday";
    string str2 = "saturday";

    cout << editDistDP(str1, str2, str1.length(), str2.length());

    return 0;
}
```

## Java

```
// A Dynamic Programming based Java program to find the minimum
// number of operations to convert str1 to str2
class EDIST
{
    // Minimum of three numbers (handles ties correctly)
    static int min(int x, int y, int z)
    {
        return Math.min(Math.min(x, y), z);
    }

    static int editDistDP(String str1, String str2, int m, int n)
    {
        // Create a table to store results of subproblems
        int dp[][] = new int[m+1][n+1];

        // Fill dp[][] in bottom up manner
        for (int i = 0; i <= m; i++)
        {
            for (int j = 0; j <= n; j++)
            {
                // If first string is empty, only option is to
                // insert all characters of second string
                if (i == 0)
                    dp[i][j] = j;  // Min. operations = j

                // If second string is empty, only option is to
                // remove all characters of first string
                else if (j == 0)
                    dp[i][j] = i;  // Min. operations = i

                // If last characters are same, ignore last char
                // and use the result for the remaining strings
                else if (str1.charAt(i-1) == str2.charAt(j-1))
                    dp[i][j] = dp[i-1][j-1];

                // If last characters are different, consider all
                // possibilities and find minimum
                else
                    dp[i][j] = 1 + min(dp[i][j-1],    // Insert
                                       dp[i-1][j],    // Remove
                                       dp[i-1][j-1]); // Replace
            }
        }

        return dp[m][n];
    }

    public static void main(String args[])
    {
        String str1 = "sunday";
        String str2 = "saturday";
        System.out.println(editDistDP(str1, str2, str1.length(), str2.length()));
    }
}
/* This code is contributed by Rajat Mishra */
```

## Python

```
# A Dynamic Programming based Python program for the edit
# distance problem
def editDistDP(str1, str2, m, n):
    # Create a table to store results of subproblems
    dp = [[0 for x in range(n+1)] for x in range(m+1)]

    # Fill dp[][] in bottom up manner
    for i in range(m+1):
        for j in range(n+1):

            # If first string is empty, only option is to
            # insert all characters of second string
            if i == 0:
                dp[i][j] = j    # Min. operations = j

            # If second string is empty, only option is to
            # remove all characters of first string
            elif j == 0:
                dp[i][j] = i    # Min. operations = i

            # If last characters are same, ignore last char
            # and use the result for the remaining strings
            elif str1[i-1] == str2[j-1]:
                dp[i][j] = dp[i-1][j-1]

            # If last characters are different, consider all
            # possibilities and find minimum
            else:
                dp[i][j] = 1 + min(dp[i][j-1],      # Insert
                                   dp[i-1][j],      # Remove
                                   dp[i-1][j-1])    # Replace

    return dp[m][n]

# Driver program
str1 = "sunday"
str2 = "saturday"

print(editDistDP(str1, str2, len(str1), len(str2)))
# This code is contributed by Bhavya Jain
```

Output:

`3`

Time Complexity: O(m x n)
Auxiliary Space: O(m x n)

Applications: There are many practical applications of the edit distance algorithm; refer to the Lucene API for a sample. Another example is displaying all the words in a dictionary that are in close proximity to a given word or an incorrectly spelled word.
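The dictionary use case can be sketched in a few lines. This is only an illustration under assumptions: `near_words`, the `max_distance` threshold, and the small word list are all hypothetical, and a real spell checker would use a much larger dictionary and a faster lookup strategy than scanning every word.

```python
def edit_distance(str1, str2):
    """Bottom-up edit distance, same recurrence as the DP solution."""
    m, n = len(str1), len(str2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0:
                dp[i][j] = j
            elif j == 0:
                dp[i][j] = i
            elif str1[i - 1] == str2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i][j - 1],      # Insert
                                   dp[i - 1][j],      # Remove
                                   dp[i - 1][j - 1])  # Replace
    return dp[m][n]


def near_words(word, dictionary, max_distance=1):
    """Return dictionary words within max_distance edits of word."""
    return [w for w in dictionary if edit_distance(word, w) <= max_distance]


# Hypothetical word list, for illustration only.
words = ["cat", "cut", "cart", "dog", "coat"]
suggestions = near_words("cat", words)
print(suggestions)  # ['cat', 'cut', 'cart', 'coat']
```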

Thanks to Vivek Kumar for suggesting above updates.


• recursive one is too slow

• Guest

Does this question really require dynamic programming?
Can't we do this:
“total” will store the minimum number of steps;
1. Take the difference in length between the two strings. [ total = | s1 | – | s2 | ]

2. Replace the remaining unmatched letters; on every replace do total++.

Correct me if I am wrong in any case 🙂

• Guest

I got that finally 🙂

• guest

Please check whether this code works for all test cases where the characters can only be lowercase letters.

#include <stdio.h>
#include <string.h>
int EditDistanceDP(char *X,char *Y)
{
int i;
int alphabet[26]={0};
int y1=strlen(Y);
int x1=strlen(X);
for(i=0;i<y1;i++)
{
alphabet[(int)Y[i]-(int)'a']++;
}
int remember=y1;
for(i=0;i<x1;i++)
{
if(alphabet[(int)X[i]-(int)'a']>0)
{
remember--;
alphabet[(int)X[i]-(int)'a']--;
}
}
return remember;
}
int main()
{
char X[]="sunday"; // vertical
char Y[]="saturday"; // horizontal

printf("Minimum edits required to convert %s into %s is %d\n",X, Y, EditDistanceDP(X, Y) );
// printf("Minimum edits required to convert %s into %s is %d by recursion\n",X, Y, EditDistanceRecursion(X, Y, strlen(X), strlen(Y)));
}

• Emmanuel Livingstone

I think there is a small typo. When considering SUNDAY and SATURDAY with i=2 and j=4, the prefix strings mentioned are SUN and SATU. It should have been SU and SATU. The analysis following the statement considers it as SU and SATU.

Correct me if I’m wrong.

• Larry

Yes. You are right

• typing..

@geeksforgeeks:disqus, can this problem be solved by finding the LCS of the two strings and then subtracting its length from the length of the longer string?

• Vivek

if (X[i]==Y[j])
it should be only
T[i][j]=T[i-1][j-1];

in the above code, why are we considering insertion and deletion
if X[i] equals Y[j] , isn’t it redundant?

• prashant jha

Here, in the naive recursive implementation, the complexity will be O(3^n),
but in DP there are exactly m*n subproblems.
Here is my simple implementation using DP:
http://ideone.com/XAUoU9

• prashant jha

#include <iostream>
#include <cstring>
#define m 20
int arr[m][m];
using namespace std;
int min(int a,int b)
{
return a>b?b:a;
}
int min(int a,int b,int c)
{
return min(min(a,b),c);
}
int fun(char st1[],char st2[],int low1,int low2,int high1,int high2)
{
if(arr[low1][low2]!=-1)
return arr[low1][low2];
if(low1>high1)
return (high2-low2+1);
if(low2>high2)
return (high1-low1+1);
arr[low1+1][low2]=fun(st1,st2,low1+1,low2,high1,high2);
arr[low1][low2+1]=fun(st1,st2,low1,low2+1,high1,high2);
arr[low1+1][low2+1]=fun(st1,st2,low1+1,low2+1,high1,high2);
if(st1[low1]!=st2[low2])
arr[low1][low2]=min(1+arr[low1+1][low2],1+arr[low1][low2+1],1+arr[low1+1][low2+1]);
else
arr[low1][low2]=min(1+arr[low1+1][low2],1+arr[low1][low2+1],arr[low1+1][low2+1]);
return arr[low1][low2];
}
int main()
{
char st1[] = "sunday" ;
char st2[] = "saturday" ;
for(int i=0;i<m;i++)
{
for(int j=0;j<m;j++)
{
arr[i][j]=-1;
}
}
cout<<fun(st1,st2,0,0,strlen(st1)-1,strlen(st2)-1)<<" is minimum possible changes to convert s1 to s2.\n";
return 0;
}

• alext

Detailed explanation:
“Alignment” is an important concept in this problem. For example, for “SUNDAY” → “SATURDAY”, the alignment should be “S _ _ U N D A Y” with “S A T U R D A Y”, and the Levenshtein distance is 3. See the blanks? Yes, blanks should also be part of the string, and blanks also contribute to the alignment. Here comes the explanation. Take, for example, “S A T _” → “S B _”, that is, E(3, 2). If we align ‘T’ with ‘B’, we get E(3, 2) = E(2, 1) + 1, which is easily understood. If we align ‘T’ with ‘_’ (from the 2nd string), we need to DELETE ‘T’ from the first string, so E(3, 2) = E(2, 2) + 1. If we align ‘_’ (from the 1st string) with ‘B’, we need to INSERT ‘B’ in the first string, hence E(3, 2) = E(3, 1) + 1.
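The alignment view can be sketched in code by filling the table and then walking back from the bottom-right cell. This is an illustrative sketch only: `align` is a hypothetical name, `_` stands for a blank (so the inputs are assumed not to contain underscores), and when several optimal alignments exist the one returned depends on the order in which the three cases are checked.

```python
def align(str1, str2):
    """Return (aligned str1, aligned str2, edit distance), using '_' as blank."""
    m, n = len(str1), len(str2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if str1[i - 1] == str2[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + cost,  # character with character
                           dp[i - 1][j] + 1,         # str1 character with blank (delete)
                           dp[i][j - 1] + 1)         # blank with str2 character (insert)

    # Walk back from dp[m][n], emitting one alignment column per step.
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                dp[i][j] == dp[i - 1][j - 1] + (str1[i - 1] != str2[j - 1])):
            top.append(str1[i - 1])
            bottom.append(str2[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            top.append(str1[i - 1])
            bottom.append("_")
            i -= 1
        else:
            top.append("_")
            bottom.append(str2[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), dp[m][n]


top, bottom, distance = align("SUNDAY", "SATURDAY")
print(top)
print(bottom)
print(distance)
```

Every column where the two rows differ corresponds to one edit, so the number of differing columns equals the edit distance.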

• zealfire

It says that for deletion we need t[i][j-1] and for insertion t[i-1][j]. How is that so? As far as I understand, it should be reversed. Please comment.

• prashantjha

/*

int fun(char st1[],char st2[],int i,int j,int h1,int h2)

{

if((i>h1)&&(j>h2))

return 0;

if(i>h1)

return (h2-j+1);

if(j>h2)

return (h1-i+1);

if(st1[i]==st2[j])

return (fun(st1,st2,i+1,j+1,h1,h2));

else

return min((1+fun(st1,st2,i+1,j,h1,h2)),

(1+fun(st1,st2,i,j+1,h1,h2)),

(1+fun(st1,st2,i+1,j+1,h1,h2)));

}
*/

• Rushikesh

Hi,

Here is the code that I could come up with to solve this problem:

public class StringTest
{
public static void main(String[] args)
{
String str1 = args[0];
String str2 = args[1];

int num = minOper(str1, str2);
System.out.println("Minimum Operations = "+num);
}

public static int minOper(String str1, String str2)
{
int len1 = str1.length();
int len2 = str2.length();

int nInsert = 0;
int nDelete = 0;
int nMod = 0;

String small = null;
String big = null;
if(len1 < len2)
{
nInsert = len2 - len1;
small = str1;
big = str2;
}
else
{
nDelete = len1 - len2;
small = str2;
big = str1;
}

int highestMatchingChars = 0;
int matchingChars = 0;
char[] smallChars = new char[small.length()+1];
small.getChars(0, small.length(), smallChars, 0);
for(int i=0;i<=(nInsert+nDelete);i++)
{
matchingChars = 0;
char[] bigChars = new char[small.length()+1];
big.getChars(i, i+small.length(), bigChars, 0);
for(int j=0;j<small.length();j++)
{
if(smallChars[j] == bigChars[j])
{
System.out.println("Small char = "+smallChars[j]+"\tBig char = "+bigChars[j]);
matchingChars++;
}
}

if(highestMatchingChars < matchingChars)
highestMatchingChars = matchingChars;
}

nMod = small.length()-highestMatchingChars;
return (nInsert + nDelete + nMod);
}
}

Please comment if anything missing in the algorithm/code here?

• hello

If the cost is the same for the R, D, and I operations, then the output is the maximum of the two string lengths minus the length of the longest common subsequence of the two strings. I think so… ? 🙂

• Mukesh M

Not True. Consider Str1 and Str2 being “AEDFHR” and “ABCDGH”. LCS is “ADH” but edit distance is 4. Although you are correct in using LCS but way to find edit distance would be break strings on matching points and then sum up max(str1_1.len,str2_1.len)+max(str1_2.len,str2_2.len)+max(str1_3.len,str2_3.len)…

• me.deeiip

How is it D.P. if recursion is not memoized?

• what’s in the name

Actually in bottom up manner only those table entries are queried which have already been entered. Hence it isn’t called memoization in true terms which is for top down manner.
*(T + (i)*n + (j)) = Minimum(leftCell, topCell, cornerCell);
This line makes entry to the table for future use .

• Shimpu

Can anyone please explain clearly the alignment idea and how the code works?

• Nitesh

The Recursive code checks for left and right even when X[m-1]==Y[n-1] , it should simply call the next corner case instead of left and right , will that miss any special case??

Recursive code :

# include

# include

# include

# include

# include

using namespace std;

string a,b;

int m,n;

int minimum(int a, int b, int c)

{

return(min(min(a,b),c));

}

int edist(int i, int j, int count)

{

if(i==m||j==n)

return count;

if(a[i]==b[j])

{

return(edist(i+1,j+1,count));

}

else

{

int a=edist(i+1,j,count+1);

int b=edist(i+1,j+1,count+1);

int c=edist(i,j+1,count+1);

return(minimum(a,b,c));

}

}

int main()

{

a="SUNDAY";

b="SATURDAY";

m=a.length();

n=b.length();

printf("%d",edist(0,0,0));

}

• anon

This is very bad,lossy description provided here.Its like first written in Chinese and then translated into english.Not expected from the team.There must be some IMAGES showing what you are trying to say.

• Kaidul Islam Sazal

The Dynamic programming portion is buggy.
This is the right implementation
```
int EditDistanceDP(char X[], char Y[], int lenX, int lenY) {
    // T[m][n]
    int T[lenX + 1][lenY + 1];
    for(int i = 0; i <= lenX; i++) T[i][0] = i;
    for(int i = 0; i <= lenY; i++) T[0][i] = i;
    for(int i = 1; i <= lenX; i++) {
        for(int j = 1; j <= lenY; j++) {
            if (X[i - 1] == Y[j - 1]) T[i][j] = T[i - 1][j - 1];
            else T[i][j] = Minimum(T[i - 1][j], T[i][j - 1], T[i - 1][j - 1]) + 1;
        }
    }
    return T[lenX][lenY];
}
```

• Chandan Mittal

Please tell what does ‘align’ means in the 3 cases above?

```
/* Short Implementation */

int EditDistanceDP(char X[], char Y[])
{
    int lx=strlen(X),ly=strlen(Y);
    int edit[lx+1][ly+1];

    for(int i=0;i<=lx;++i)
        edit[i][0]=i;
    for(int i=0;i<=ly;++i)
        edit[0][i]=i;
    for(int i=1;i<=lx;++i)
        for(int j=1;j<=ly;++j)
            edit[i][j]=min(edit[i-1][j-1]+!(X[i-1]==Y[j-1]),min(edit[i][j-1],edit[i-1][j])+1);

    return edit[lx][ly];
}
```

• t_thirupathi

A small correction in the sentence –
“Given strings SUNDAY and SATURDAY. We want to convert SUNDAY into SATURDAY with minimum edits. Let us pick i = 2 and j = 4 i.e. prefix strings are SUN and SATU respectively (assume the strings indices start at 1). The right most characters can be aligned in three different ways.”

Instead of “i = 2 and j = 4”, shouldn’t it be “i = 3 and j = 4”?

• Nagaraju

It is i=2 only and their logic follows this, but mistake here is they wrote “SUN” instead of “SU”

• rahul23

@venki

In DP implementation:-
// T[i][j-1]
leftCell = *(T + i*n + j-1);
leftCell += EDIT_COST; // deletion

// T[i-1][j]
topCell = *(T + (i-1)*n + j);
topCell += EDIT_COST; // insertion

Deletion should be insertion and insertion shoud be deletion

Like if we have A in X and Y is C
then A->Y
delete will say delete A…we need to find cost for NULL->Y
which will be given by [i-1][j]

and insertion is given by [i][j-1]
as if A->Y
we will insert Y and find cost to A->NULL

[i][j-1]
Plz update it if m ryt,otherwise correct me.w8ing for ur response

• rahul23

@venki In the recursive definition of the function,
the following line

int corner = EditDistanceRecursion(X, Y, m-1, n-1) + (X[m] != Y[n]);
should be changed to
int corner = EditDistanceRecursion(X, Y, m-1, n-1) + (X[m-1] != Y[n-1]);
For eg. if we have X=”A” and Y=”X”;then min should be 1.
But your function will compare m and n index value(1 and 1 index)as m and n contains 1 which is NULL
and considering these equal and replacement cost is 0 and it calls for m-1 and n-1 which will be 0..so 0+0 will become 0…Kindly update in corner variable m-1 and n-1.

• harshieee

program giving wrong output for
s1 = “hello”
s2 = “hellooo”

The output is coming out as:
Minimum edits required to convert hello into hellooo is 2
Minimum edits required to convert hello into hellooo is 5 by recursion

• Swapnil R Mehta

There is a problem in “Minimum” function, thus answers are coming different with dp and recursive approach.

```
int Minimum(int a, int b, int c)
{
    int min;
    if( a < b && a < c ) min = a;
    else if( b < a && b < c ) min = b;
    else min = c;
    return min;
}
```
• Thanks both of you for pointing the error. Code is updated.

• sobhan

#include
using namespace std;

T[i][j]=min(leftcell,min(cornercell,topcell));

• shine

can we think of applying these oprations in certain conditions…like insert or delete can give min cost if l1l2 delete or replace may be beneficial…plz do reply

• Alka

What if we convert “SATURDAY” to “SUNDAY”? Results in both the methods used above are different.

• wgpshashank

Yes , it should be and it will.

• sreeram

i think in the base cases E(i,0) it should be like i*EDIT_COST instead of i

• Yes. In the current program we took all edit operations of same cost.

• Manak

Correct me if I am wrong, but can this question be solved by first finding the largest common sub-sequence and then subtracting it from the length of the greater string?

• Venki

No, that will not always lead to optimum alignment.

• Manak

Can you specify an example? I cannot get my head around this.

• Venki

Manak, you can take another example given in the content. Consider the words

exponential – ponil = exent
polynomial – ponil = lyom

But the ED(exponential, polynomial) != ED(exent, lyom), here ED stands for Edit Distance.

Practice with few examples, if still not clear, let me know. I would need some time for detailed explanation.

• zyfo2

Don’t get it. Can you give a more detailed example including how to edit? Thank you.

• zyfo2

get it.
the example is like
abcd
cde
LCS=2 “cd”
but edit distance is 3

• zyfo2

In fact, if only deletion and insertion are possible, then LCS can be applied here.
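The two examples from this thread can be checked with a short sketch. It is not from the article: `lcs_length` and `edit_distance` (with its `allow_replace` flag) are hypothetical helpers written for this check. The check relies on the standard identity that, when only insertions and deletions are allowed, the distance equals m + n - 2 * LCS.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


def edit_distance(a, b, allow_replace=True):
    """Edit distance; with allow_replace=False only insert/delete are used."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                best = min(dp[i][j - 1], dp[i - 1][j])  # Insert, Remove
                if allow_replace:
                    best = min(best, dp[i - 1][j - 1])  # Replace
                dp[i][j] = 1 + best
    return dp[m][n]


for a, b in [("abcd", "cde"), ("AEDFHR", "ABCDGH")]:
    lcs = lcs_length(a, b)
    print(a, b,
          "max_len - LCS =", max(len(a), len(b)) - lcs,
          "edit distance =", edit_distance(a, b),
          "insert/delete only =", edit_distance(a, b, allow_replace=False),
          "m + n - 2*LCS =", len(a) + len(b) - 2 * lcs)
```

For both pairs the value "longer length minus LCS" is smaller than the actual edit distance, while the insert/delete-only distance matches m + n - 2 * LCS.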

• Silent

I guess we can do it by LCS..we would have to compare longest common subsequence with both the strings a character at a time.. correct me if i am wrong??

• brahma

could you provide one test case for that..

• bhuvi

The following line should be changed

```
cornerCell += (X[i-1] != Y[j-1]);
```

to

```
cornerCell += (X[i] != Y[j]);
```

since we are at i,j we should be comparing x[i] and y[j]. What say?

• Venki

Thanks for the comment. The indexing is not an error. Please read the content. We use a table of size (m+1) x (n+1). The indices i and j are one step ahead of the string location, so we need to subtract 1.

• Saurabh Jain

[sourcecode language="JAVA"]

import java.util.Scanner;

/**
*
* @author saurabh
*/
public class EditDistanceDPP
{
char[] s1,s2;

public EditDistanceDPP()
{
Scanner sc = new Scanner(System.in);
s1 = sc.nextLine().toCharArray();
s2 = sc.nextLine().toCharArray();
System.out.println("Edit distance is : "+editDistance(s1,s2));
}

private int editDistance(char[] st1, char[] st2)
{
int[][] s = new int[s1.length+1][s2.length+1];
for(int i=0; i<=s1.length; i++)
{
for(int j=0; j<=s2.length; j++)
{
if(i==0)
s[i][j]=j;
else if(j==0)
s[i][j]=i;
else
s[i][j] = min(s[i-1][j-1]+(st1[i-1]==st2[j-1]?0:1),s[i-1][j]+1,s[i][j-1]+1);
}
}
return s[s1.length][s2.length];
}

int min(int a, int b, int c)
{
return(a<b?a<c?a:c:b<c?b:c);
}

public static void main(String[] args)
{
EditDistanceDPP edd = new EditDistanceDPP();
}
}

This is a quite simple Dynamic Programming approach with time complexity as O(m*n) and space complexity also as O(m*n)….

• Saurabh Jain

Correct me..if anything is wrong in the above code…thanks….

• SAM

your code is working fine!! did anyone pointed some mistakes in it??

• robin singh

kindly quote some references to this problem so that it becomes more clear.
thankyou

• Venki

Algorithms by Dasgupta is a good reference.

• Jatin

In function display the below changes should be made–>
//(base + r * col)1 should be replaced by *(base + r * col + c)

• Venki

@Jatin, thanks. It was typo during post update. I have updated the post.

• PsychoCoder

In the documentation of the table inside the program :

leftCell = table[i][j-1] ;
and
topCell = table[i-1][j] ;

It should be,
leftCell = table[i-1][j] ;
and
topCell = table[i][j-1] ;

• Ratan

“Given strings SUNDAY and SATURDAY. We want to convert SUNDAY into SATURDAY with minimum edits. Let us pick i = 2 and j = 4 i.e. prefix strings are SUN and SATU respectively”

in this line change i=3 or prefix as ‘SU’.

• Doom

Usually the costs D, I and R are not same. In such case the problem can be represented as an acyclic directed graph (DAG) with weights on each edge, and finding shortest path gives edit distance.
How to construct this graph? could you plz give some basic steps? just the logic.

• Anonymous
```
int EDIT[100][100];
int solve_edit( string a, string b) {
    for (int j=0;j<=b.size();j++) {
        EDIT[0][j]=j;
    }
    for (int i=1;i<=a.size();i++) {
        EDIT[i][0]=i;
        for (int j=1;j<=b.size();j++) {
            EDIT[i][j]= min( min( EDIT[i][j-1]+1,EDIT[i-1][j]+1),  EDIT[i-1][j-1]+ (int)(a[i-1]!=b[j-1]));
        }
    }

    return EDIT[a.size()][b.size()];
}
```
• rajcools

in description its written —
Combining all the subproblems minimum cost of aligning prefix strings ending at i and j given by

E(i, j) = min( [E(i-1, j) + D], [E(i, j-1) + I], [E(i-1, j-1) + I if i,j characters are not same] )

in —[E(i-1, j-1) + I if i,j characters are not same] )

shouldnt here be replace(R) instead of Insert(I)

else it would be two operations
[E(i-1, j-1) + I +D … we insert one char from target string and delete from original string

• @rajcools, thanks. It should be replace. I will update.

• iitr.ankur

Instead of Using DAG, can’t we simply define 3 different Edit Costs: Edit_Insert(ex. 1), Edit_Delete(2), Edit_Remove(5) and use these in the 3 cases??
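A sketch of that suggestion, under assumptions: `weighted_edit_distance` is a hypothetical name, the default costs 1, 2, and 5 are the example values from the comment above (reading the third one as the replace cost), and costs are assumed to be non-negative. The base cases are scaled by the matching cost, and all three cases are always compared because with unequal costs a replace may be more expensive than a delete plus an insert.

```python
def weighted_edit_distance(str1, str2,
                           insert_cost=1, delete_cost=2, replace_cost=5):
    """Edit distance with a separate cost per operation."""
    m, n = len(str1), len(str2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    # Converting a prefix of str1 to the empty string needs only deletions.
    for i in range(1, m + 1):
        dp[i][0] = i * delete_cost
    # Converting the empty string to a prefix of str2 needs only insertions.
    for j in range(1, n + 1):
        dp[0][j] = j * insert_cost

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = str1[i - 1] == str2[j - 1]
            dp[i][j] = min(
                dp[i][j - 1] + insert_cost,                       # Insert
                dp[i - 1][j] + delete_cost,                       # Remove
                dp[i - 1][j - 1] + (0 if same else replace_cost)  # Match or Replace
            )
    return dp[m][n]


# With unit costs this reduces to the usual edit distance.
print(weighted_edit_distance("sunday", "saturday", 1, 1, 1))  # 3
# With the example costs, deleting 'a' and inserting 'u' (2 + 1)
# is cheaper than replacing 'a' with 'u' (5).
print(weighted_edit_distance("cat", "cut"))  # 3
```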