Dynamic Programming | Set 4 (Longest Common Subsequence)

We have discussed Overlapping Subproblems and Optimal Substructure properties in Set 1 and Set 2 respectively. We also discussed one example problem in Set 3. Let us discuss Longest Common Subsequence (LCS) problem as one more example problem that can be solved using Dynamic Programming.

LCS Problem Statement: Given two sequences, find the length of longest subsequence present in both of them. A subsequence is a sequence that appears in the same relative order, but not necessarily contiguous. For example, “abc”, “abg”, “bdf”, “aeg”, ‘”acefg”, .. etc are subsequences of “abcdefg”. So a string of length n has 2^n different possible subsequences.

It is a classic computer science problem, the basis of diff (a file comparison program that outputs the differences between two files), and has applications in bioinformatics.

Examples:
LCS for input Sequences “ABCDGH” and “AEDFHR” is “ADH” of length 3.
LCS for input Sequences “AGGTAB” and “GXTXAYB” is “GTAB” of length 4.

The naive solution for this problem is to generate all subsequences of both given sequences and find the longest matching subsequence. This solution is exponential in term of time complexity. Let us see how this problem possesses both important properties of a Dynamic Programming (DP) Problem.

1) Optimal Substructure:
Let the input sequences be X[0..m-1] and Y[0..n-1] of lengths m and n respectively. And let L(X[0..m-1], Y[0..n-1]) be the length of LCS of the two sequences X and Y. Following is the recursive definition of L(X[0..m-1], Y[0..n-1]).

If last characters of both sequences match (or X[m-1] == Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = 1 + L(X[0..m-2], Y[0..n-2])

If last characters of both sequences do not match (or X[m-1] != Y[n-1]) then
L(X[0..m-1], Y[0..n-1]) = MAX ( L(X[0..m-2], Y[0..n-1]), L(X[0..m-1], Y[0..n-2])

Examples:
1) Consider the input strings “AGGTAB” and “GXTXAYB”. Last characters match for the strings. So length of LCS can be written as:
L(“AGGTAB”, “GXTXAYB”) = 1 + L(“AGGTA”, “GXTXAY”)

2) Consider the input strings “ABCDGH” and “AEDFHR. Last characters do not match for the strings. So length of LCS can be written as:
L(“ABCDGH”, “AEDFHR”) = MAX ( L(“ABCDG”, “AEDFHR”), L(“ABCDGH”, “AEDFH”) )

So the LCS problem has optimal substructure property as the main problem can be solved using solutions to subproblems.

2) Overlapping Subproblems:
Following is simple recursive implementation of the LCS problem. The implementation simply follows the recursive structure mentioned above.

/* A Naive recursive implementation of LCS problem */
#include<stdio.h>
#include<stdlib.h>

int max(int a, int b);

/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
   if (m == 0 || n == 0)
     return 0;
   if (X[m-1] == Y[n-1])
     return 1 + lcs(X, Y, m-1, n-1);
   else
     return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n));
}

/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}

/* Driver program to test above function */
int main()
{
  char X[] = "AGGTAB";
  char Y[] = "GXTXAYB";

  int m = strlen(X);
  int n = strlen(Y);

  printf("Length of LCS is %d\n", lcs( X, Y, m, n ) );

  getchar();
  return 0;
}

Time complexity of the above naive recursive approach is O(2^n) in worst case and worst case happens when all characters of X and Y mismatch i.e., length of LCS is 0.
Considering the above implementation, following is a partial recursion tree for input strings “AXYT” and “AYZX”

                         lcs("AXYT", "AYZX")
                       /                 \
         lcs("AXY", "AYZX")            lcs("AXYT", "AYZ")
         /            \                  /               \
lcs("AX", "AYZX") lcs("AXY", "AYZ")   lcs("AXY", "AYZ") lcs("AXYT", "AY")

In the above partial recursion tree, lcs(“AXY”, “AYZ”) is being solved twice. If we draw the complete recursion tree, then we can see that there are many subproblems which are solved again and again. So this problem has Overlapping Substructure property and recomputation of same subproblems can be avoided by either using Memoization or Tabulation. Following is a tabulated implementation for the LCS problem.

/* Dynamic Programming implementation of LCS problem */
#include<stdio.h>
#include<stdlib.h>
 
int max(int a, int b);
 
/* Returns length of LCS for X[0..m-1], Y[0..n-1] */
int lcs( char *X, char *Y, int m, int n )
{
   int L[m+1][n+1];
   int i, j;
 
   /* Following steps build L[m+1][n+1] in bottom up fashion. Note 
      that L[i][j] contains length of LCS of X[0..i-1] and Y[0..j-1] */
   for (i=0; i<=m; i++)
   {
     for (j=0; j<=n; j++)
     {
       if (i == 0 || j == 0)
         L[i][j] = 0;
 
       else if (X[i-1] == Y[j-1])
         L[i][j] = L[i-1][j-1] + 1;
 
       else
         L[i][j] = max(L[i-1][j], L[i][j-1]);
     }
   }
   
   /* L[m][n] contains length of LCS for X[0..n-1] and Y[0..m-1] */
   return L[m][n];
}
 
/* Utility function to get max of 2 integers */
int max(int a, int b)
{
    return (a > b)? a : b;
}
 
/* Driver program to test above function */
int main()
{
  char X[] = "AGGTAB";
  char Y[] = "GXTXAYB";
 
  int m = strlen(X);
  int n = strlen(Y);
 
  printf("Length of LCS is %d\n", lcs( X, Y, m, n ) );
 
  getchar();
  return 0;
}

Time Complexity of the above implementation is O(mn) which is much better than the worst case time complexity of Naive Recursive implementation.

The above algorithm/code returns only length of LCS. Please see the following post for printing the LCS.
Printing Longest Common Subsequence

Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.

References:
http://www.youtube.com/watch?v=V5hZoJ6uK-s
http://www.algorithmist.com/index.php/Longest_Common_Subsequence
http://www.ics.uci.edu/~eppstein/161/960229.html
http://en.wikipedia.org/wiki/Longest_common_subsequence_problem





  • Ankit Jain

    int lcs (char str1[],char str2[],int len1,int len2)
    {
    int M[len1+1][len2+1],i,j,flag[len1+1][len2+1],a=0;
    for(i=0;i<len1+1;i++)
    {
    M[i][0]=0;
    flag[i][0]=2;
    }
    for(j=0;j<len2+1;j++)
    {
    M[0][j]=0;
    flag[0][j]=2;
    }
    for(i=1;i<len1+1;i++)
    {
    for(j=1;j=M[i][j-1])
    {
    M[i][j]=M[i-1][j];
    flag[i][j]=1;
    }
    else
    {
    M[i][j]=M[i][j-1];
    flag[i][j]=-1;
    }
    }
    }

    i=len1;
    j=len2;

    while(1)
    {

    if(flag[i][j]==3)
    {
    printf(“%c”,str1[i-1]);
    i-=1;j-=1;
    a++;
    }
    else if(flag[i][j]==1)
    {
    i-=1;
    }
    else if(flag[i][j]==-1)
    {
    j-=1;
    }
    else
    {
    break;
    }

    }
    return a;
    }

  • Joel

    Here’s an implementation although not memoized that returns the sequence itself:

    struct pair {

    int start;

    int end;

    };

    struct char_list {

    char c;

    struct char_list *next_char;

    };

    int find_len(struct char_list *list) {

    int n = 0;

    while(list) {

    n++;

    list = list->next_char;

    }

    return n;

    }

    struct char_list *lcs(char *str1, char *str2, struct pair p1, struct pair p2, struct char_list *clist)

    {

    struct char_list *tmp;

    struct char_list *option1, *option2;

    if(p1.end < p1.start || p2.end c = str1[p1.end];

    tmp->next_char = clist;

    p1.end–;

    p2.end–;

    return lcs(str1, str2, p1, p2, tmp);

    } else {

    p2.end–;

    option1 = lcs(str1, str2, p1, p2, clist);

    p2.end++;

    p1.end–;

    option2 = lcs(str1, str2, p1, p2, clist);

    p1.end++;

    /* TODO: Fix memory leaks */

    if(find_len(option1) < find_len(option2))

    return option2;

    else

    return option1;

    }

    }

  • wgpshashank

    int LCS_Memoization (int i, int j) //0,0 starting point
    {
    if (L[i,j] < 0) {
    if (A[i] == '' || B[j] == '') L[i,j] = 0;
    else if (A[i] == B[j]) L[i,j] = 1 + LCS_Memoization (i+1, j+1);
    else L[i,j] = max(LCS_Memoization (i+1, j), LCS_Memoization (i, j+1));
    }
    return L[i,j];
    }

  • prashnat

    /*

    int fun(int** arr,char st1[],char st2[],int i,int j)

    {

    if((i>=strlen(st1))||(j>=strlen(st2)))

    return 0;

    if(arr[i][j]!=-1)

    return arr[i][j];

    if(st1[i]==st2[j])

    {

    arr[i][j]=1+fun(arr,st1,st2,i+1,j+1);

    return arr[i][j];

    }

    else

    {

    arr[i][j]=max(fun(arr,st1,st2,i+1,j),fun(arr,st1,st2,i,j+1));

    return arr[i][j];

    }

    }

    */

  • prashnat
  • Code_Addict

    Java version for above LCS problem (Recursion + DP):
    http://ideone.com/fsgBNB

    • rohtakdev

      Doesn’t print correct sequence.
      try with AGGTAB and GXTXAYB

  • tina

    How can we di backtracking to get the best optimal path?

    • sr

      fuck off..

  • Aritra

    i have used mergesort for this .. i think it’ll take O(nlogn) + O(n).. If i am wrong please comment.. thank you

     
    /*Aritra Das*/
    
    
    #include<stdio.h>
    #include<conio.h>
    
    void Mergesort(char[],int,int);
    void Merge(char[],int,int,int);
    
    void main()
    {
               int n,i,count=0,k;
               char a[100];
                   
                   printf("\nEnter 1st sequence length: ");
                   scanf("%d",&n);
                   printf("\nEnter the sequence : ");
                 
                   for(i=0;i<n;i++)
                   {
                                  a[i]=getche();
    							   
    							  							                  
    							   	 }
                    printf("\n\nEnter 2st sequence length: ");
                   scanf("%d",&n);
                   printf("\nEnter the sequence : ");
                 
                   for(k=i;k<n+i;k++)
                   {
                                  a[k]=getche();
    							  // scanf("%c",&a[i]); 
    							  fflush(stdin);
    							                  
    							   	 }
             
    		   printf("\n%d %d\n",n,i);
               
                      Mergesort(a,0,(n+i)-1);       
                     
                   		  for(k=0;k<(n+i);k++)
    				  {
    				  	
    					  if(a[k]==a[k+1])
    				  	{
    				  	
    						  count++;
    					}
    				  	
    				  }     
    				  
    				  printf("\n\nThe Longest Common subsequence is of length %d",count);        
                                               
         }
    
    void Mergesort(char a[],int low,int high)
    {
                       int mid;
                       if(low<high)
                       {
                                   mid=(low+high)/2;
                                   Mergesort(a,low,mid);
                                   Mergesort(a,mid+1,high);
                                   Merge(a,low,mid,high);
                                   }     
         }
         
    void Merge(char a[],int low,int mid,int high)
    {
                   int b[20],l1,r1,i;
                   
                   l1=low;
                   r1=mid+1;
                   i=low;
                   
                   while((l1<=mid) && (r1<=high))
                   {
                                   if(a[l1]<a[r1])
                                   {
                                         b[i]=a[l1];
                                         l1++;
                                         i++;          
                                                   }
                                   else
                                   {
                                       b[i]=a[r1];
                                       r1++;
                                       i++;
                                       }                
                                   } 
                       while(l1<=mid)
                       {
                                     b[i]=a[l1];
                                     l1++;
                                     i++;
                                     }   
                      
                        while(r1<=high)
                        {
                                       b[i]=a[r1];
                                       r1++;
                                       i++;
                                       }   
                                                          
                      for(i=low;i<=high;i++)
                                a[i]=b[i];                                                     
         }     
     
    • Aritra

      sorry for posting this .. and couldn’t find the way to delete or modify it .. it’s wrong .. and i am working on this

       
      /* Paste your code here (You may delete these lines if not writing code) */
       
    • Bharathkumar Hegde

      consider this input

      Enter 1st sequence length: 2

      Enter the sequence : aa

      Enter 2st sequence length: 2

      Enter the sequence : bb

      actual ans is 0 but your solution gives 2 as answer.

      your solution does not discriminate between characters from same string and characters from different string

  • AMIT

    Shouldn’t it be
    if(x[i-1]==y[i-1])
    l[i][j]=max(l[i-1][j],l[i][j-1],l[i-1][j-i]+1);
    instead of
    L[i][j] = L[i-1][j-1] + 1;
    Will that make any difference?

  • goyalmnl

    We can also do this using set3 approach. let arr[i] be the LCS of the sequence string1[i] and string2[n-1],including string1[i]. Also we store the position of string1[i] in string2 at each arr[i]. Now, for any j>i, first check the position of string1[j]>string1[i] in string2 and from these pick the one with maximum length.
    Finally, the one with the largest length (largest value of arr)is the ans.
    Please correct me if i am wrong.

  • Prateek Sharma

    Python code with O(n) time complexity without using dynamic programming approach.It generates all longest common subsequence if present b/w two strings.
    Approach-
    1)Compute all substrings of both strings using normal substring generating algo(NO ned to use recursion here).Each substring has the same order of elements as present in original string.
    2)Compare all substrings of both strings with each other and find the longest common substring. That’s the answer.

    import math
    def subStringsOfStringUsingBinaryMethod(list1,len1):
    storingList =[]
    list2 =[]
    for i in range(int(len1+1)):
    list2.append(i)
    for i in list2:
    remainder = []
    temp =[]
    while i!= 0:
    remainder.append(i%2)
    i = i/2
    for i in range(len(remainder)):
    if remainder[i] == 1:
    temp.append(list1[i])
    storingList.append(temp)
    return storingList

    def longestCommonSubsequence(storingList,storingList1):
    stringList =[]
    for i in storingList:
    for j in storingList1:
    if i == j:
    stringList.append(i)
    max =0
    var =’a’
    for i in stringList:
    if len(i)>max:
    max = len(i)
    tempList =[]
    for i in stringList:

    if len(i) == max:
    if any([True for e in [i] if e in tempList]):
    continue
    else:
    tempList.append(i)
    for i in tempList:
    print "".join(i)

    def main():
    string1 = "bdcaba"
    string2 = "abcbdab"
    list1 = list(string1)
    list2 = list(string2)
    len1 = math.pow(2,len(string1))-1
    len2 = math.pow(2,len(string2))-1
    storingList = subStringsOfStringUsingBinaryMethod(list1,len1)
    storingList1 = subStringsOfStringUsingBinaryMethod(list2,len2)
    longestCommonSubsequence(storingList,storingList1)

    if __name__ == ‘__main__':
    main()

  • Laurence

    Something is wrong in the post. If(X[i-1] == Y[j-1]) we should not simply add 1 before we check the newly added character will keep the new longest common subsequence IN RELATIVE ORDER!!!

  • George

    “So a string of length n has 2^n different possible subsequences.”
    Aren’t 2^n-1 possible subsequences ?
    Regards

    • Prateek Jassal

      I agree George. It will be 2^n-1 because the number of ways of choosing 0 characters from n (which is 1 way) is also included in 2^n. That is not a valid subsequence. I would request the authors to please update this.

  • Sandeep

    Can we straight away remove the characters present in first input sequence X and not in the second input sequence and the vice versa.

     
    /* Paste your code here (You may delete these lines if not writing code) */
     
  • chanchal

    can anyone write a function to print all Longest Common Sequence with same max langth.
    for ex. strings are
    s1=”BDCABA”;
    s2=”ABCBDAB”;
    max length of LCS is 4 and there are two possible sequences.
    BDAB and BCAB

  • Shaanu

    Can you please provide and explains the suffix tree or suffix array implementation as it would cost O(n+m) in finding the length of LCS..

  • sparco

    Code for First row and column space optimised in the code
    Instead of L[m+1][n+1] , L[m][n] can be used

     
    /* Paste your code here (You may delete these lines if not writing code) */
    /* Dynamic Programming implementation of LCS problem */
    #include<stdio.h>
    #include<stdlib.h>
     
    int max(int a, int b);
     
    /* Returns length of LCS for X[0..m-1], Y[0..n-1] */
    int lcs( char *X, char *Y, int m, int n )
    {
       int L[m][n];
       int i, j;
     
       /* Following steps build L[m+1][n+1] in bottom up fashion. Note
          that L[i][j] contains length of LCS of X[0..i-1] and Y[0..j-1] */
       for (i=0; i<m; i++)
       {
         for (j=0; j<n; j++)
         {
           if(X[i] == Y[j])
             L[i][j] = (i==0||j==0)?1:L[i-1][j-1]+1;
           else
             L[i][j] = (i==0||j==0)?0:max(L[i-1][j], L[i][j-1]);
       printf("%d",L[i][j]);
         }
         printf("\n");
       }
     
    
       /* L[m][n] contains length of LCS for X[0..n-1] and Y[0..m-1] */
       return L[m-1][n-1];
    }
     
    /* Utility function to get max of 2 integers */
    int max(int a, int b)
    {
        return (a > b)? a : b;
    }
     
    /* Driver program to test above function */
    int main()
    {
      char X[] = "AGGTAB";
      char Y[] = "GXTXAYB";
     
      int m = strlen(X);
      int n = strlen(Y);
     
      printf("Length of LCS is %d\n", lcs( X, Y, m, n ) );
     
      getchar();
      return 0;
    }
    
     
    • Priyanka
       
      Can you please explain your
      approach little bit?
       
  • kartikaditya

    public class LongestCommonSubsequence {
    public static String getLongestCommonSubsequence(String s1, String s2) {
    int dp[][] = new int[s1.length()][s2.length()];
    dp[0][0] = (s1.charAt(0) == s2.charAt(0)) ? 1 : 0;
    for (int i = 1; i < s1.length(); ++i) {
    if (s1.charAt(i) == s2.charAt(0)) {
    dp[i][0] = 1;
    } else {
    dp[i][0] = dp[i – 1][0];
    }
    }
    for (int i = 1; i < s2.length(); ++i) {
    if (s1.charAt(0) == s2.charAt(i)) {
    dp[0][i] = 1;
    } else {
    dp[0][i] = dp[0][i – 1];
    }
    }
    for (int i = 1; i < s1.length(); ++i) {
    for (int j = 1; j < s2.length(); ++j) {
    if (s1.charAt(i) == s2.charAt(j)) {
    dp[i][j] = 1 + dp[i – 1][j – 1];
    } else {
    dp[i][j] = Math.max(dp[i][j – 1], dp[i – 1][j]);
    }
    }
    }
    StringBuffer sb = new StringBuffer();
    int x = s1.length() – 1, y = s2.length() – 1;
    while (x >= 0 && y >= 0) {
    if (s1.charAt(x) == s2.charAt(y)) {
    sb.append(s1.charAt(x));
    –x;
    –y;
    } else {
    if (x > 0 && dp[x – 1][y] == dp[x][y]) {
    –x;
    } else {
    –y;
    }
    }
    }
    return sb.reverse().toString();
    }

    public static void main(String[] args) {
    System.out.println(getLongestCommonSubsequence("ABCDGH", "AEDFHR"));
    System.out.println(getLongestCommonSubsequence("AGGTAB", "GXTXAYB"));
    }
    }

    • York Tsai

      Thank you. Your code is very clearly.

       
      /* Paste your code here (You may delete these lines if not writing code) */
       
  • aggarwald2002

    Below is java code that also prints the longest common subsequence. It uses bottom-up dp technique.

    public class LongestCommonSubsequence
    {

    /**
    * longest-common-subsequence cache
    */
    private static List<Character>[][] lcsCache;

    /**
    * We’ll use bottom-up dynamic programming technique.
    *
    * LCS(X[0..m-1], Y[0..n-1]) = {
    * 1. if character X[m-1] == Y[n-1], then this character will be part of LCS found for X[0..n-2] and Y[0..m-2]
    * LCS(X[0..m-2], Y[0..n-2]) + 1
    * 2. if character X[m-1] != Y[n-1], then
    * max(LCS(X[0..m-1], Y[0..n-2], LCS(X[0..m-2], Y[0..n-1])
    *
    * @return list containing longest-common-subsequence for strings X and Y.
    */
    private static List<Character> LCS(Character[] X, Character[] Y)
    {
    // Initialize lcsCache for Y[0] and X[0..n-1]
    for (int i = 0; i < X.length; ++i) {
    if (X[i].equals(Y[0])) {
    lcsCache[i][0] = new ArrayList<Character>();
    lcsCache[i][0].add(X[0]);
    }
    else {
    lcsCache[i][0] = Collections.emptyList();
    }
    }

    // Initialize lcsCache for X[0] and Y[1..m-1]
    for (int j = 1; j < Y.length; ++j) {
    if (X[0].equals(Y[j])) {
    lcsCache[0][j] = new ArrayList<Character>();
    lcsCache[0][j].add(X[0]);
    }
    else {
    lcsCache[0][j] = Collections.emptyList();
    }
    }

    for (int i = 1; i <= X.length – 1; ++i) {
    for (int j = 1; j <= Y.length – 1; ++j) {
    if (X[i].equals(Y[j])) {
    lcsCache[i][j] = lcsCache[i – 1][j – 1] != null
    ? new ArrayList<Character>(lcsCache[i – 1][j – 1]) : new ArrayList<Character>();
    lcsCache[i][j].add(X[i]);
    }
    else {
    lcsCache[i][j] = max(
    lcsCache[i][j – 1] != null ? lcsCache[i][j – 1] : Collections.EMPTY_LIST,
    lcsCache[i – 1][j] != null ? lcsCache[i – 1][j] : Collections.EMPTY_LIST);
    }
    }
    }

    return lcsCache[X.length – 1][Y.length – 1];
    }

    private static List<Character> max(List<Character> list1, List<Character> list2)
    {
    return (list1.size() >= list2.size() ? list1 : list2);
    }

    /**
    * @param args
    */
    public static void main(String[] args)
    {
    List<Character> input = new ArrayList<Character>();
    Character[] X = new Character[] { ‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, ‘g’, ‘h’, ‘i’, ‘j’ };
    Character[] Y = new Character[] {
    ‘b’, ‘a’, ‘d’, ‘c’, ‘f’, ‘e’, ‘h’, ‘a’, ‘b’, ‘c’, ‘d’, ‘e’, ‘f’, ‘g’, ‘h’, ‘i’, ‘j’, ‘g’, ‘j’, ‘i’ };
    lcsCache = new List[X.length][Y.length];

    List<Character> lcs = LCS(X, Y);
    System.out.println(lcs);
    }

    }

  • http://mobilityworld.wordpress.com Punit

    Here is Java version of the Dynamic Programming implementation of LCS problem that prints the LCS

    public class LCS {
    public static void main(String[] args) {
    String x = "AGGTAB";
    String y = "GXTXAYB";
    System.out.println("Length of LCS is "+ lcs(x,y, x.length(), y.length()) );
    }

    static String lcs(String x,String y, int m, int n){
    if(m == 0 || n == 0)
    return "";

    if(x.charAt(m-1) == y.charAt(n-1)){
    return lcs(x,y,m-1,n-1)+x.charAt(m-1);
    }

    else
    //return Math.max(lcs(x,y,m, n-1), lcs(x,y,m-1,n));
    return maxString(lcs(x,y,m, n-1), lcs(x,y,m-1,n));

    }

    public static String maxString(String a, String b){
    if(a.length() >= b.length())
    return a;
    else
    return b;
    }
    }

    • mohitk

      Hi,
      Just a query, but I think this would run in exponential time?
      This seems to be quite the opposite of Dynamic prog.

       
      /* Paste your code here (You may delete these lines if not writing code) */
       
  • InvalidUserName

    Typo in explanation in Optimal Substructure section example

    “Last characters do not match for strings “ABCDGH” and “AEDFHR”. So length of LCS can be written as:

    L(“AGGTAB”, “GXTXAYB”) = MAX ( L(“ABCDG”, “AEDFHR”), L(“ABCDGH”, “AEDFH”) )”

    Replace with:
    L(“ABCDGH”, “AEDFHR”) = MAX ( L(“ABCDG”, “AEDFHR”), L(“ABCDGH”, “AEDFH”) )

    -Ujjwal W

    • GeeksforGeeks

      @Ujjwal W: Thanks for pointing this out. We have corrected the typo. Keep it up!!

  • http://www.linkedin.com/in/ramanawithu Venki

    @Sandeep, the title is misleading. The programs prints only the length of LCS, not the LCS itself.

    The table approach is more understandable if we can print the table status after each iteration of inner loop (initially fill the table with zeros or -1). Simply print the 2D matrix after every inner loop. It would be helpful to the beginners of DP how the algorithm builds the table.

    • http://www.linkedin.com/in/ramanawithu Venki

      Hope the following code helps in printing the table at runtime.

       
      /* Dynamic Programming implementation of LCS problem */
      #include<stdio.h>
      #include<stdlib.h>
      #include<string.h>
      
      int max(int a, int b);
      
      /* Change these string to test the program */
      #define STRING_1 "MZJAWXU"
      #define STRING_2 "XMJYAXU"
      
      /* Algorithm visulization helper */
      void display(int *base, int row, int col)
      {
          int r, c;
      
          for(r = 0; r < row; r++)
          {
              for(c = 0; c < col; c++)
              {
                  printf("\t%4d", (base + r * col));
              }
      
              printf("\n");
          }
      }
      
      /* Returns length of LCS for X[0..m-1], Y[0..n-1] */
      int lcs( char *X, char *Y, int m, int n )
      {
          int L[sizeof(STRING_1)][sizeof(STRING_2)];
          int i, j;
      
          for (i = 0; i <= m; i++)
          {
              for (j = 0; j <= n; j++)
              {
                  L[i][j] = -1;
              }
          }
      
          /* Following steps build L[m+1][n+1] in bottom up fashion. Note
          that L[i][j] contains length of LCS of X[0..i-1] and Y[0..j-1] */
          for (i = 0; i <= m; i++)
          {
              for (j = 0; j <= n; j++)
              {
                  if (i == 0 || j == 0)
                      L[i][j] = 0;
      
                  else if(X[i-1] == Y[j-1])
                      L[i][j] = L[i-1][j-1] + 1;
      
                  else
                      L[i][j] = max(L[i-1][j], L[i][j-1]);
              }
      
              /* Algorithm visulization code */
              display((int *)L, m+1, n+1);
              printf("\n\n");
          }
      
          /* L[m][n] contains length of LCS for X[0..n-1] and Y[0..m-1] */
          return L[m][n];
      }
      
      /* Utility function to get max of 2 integers */
      int max(int a, int b)
      {
          return (a > b) ? a : b;
      }
      
      /* Driver program to test above function */
      int main()
      {
          char X[] = STRING_1;
          char Y[] = STRING_2;
      
          int m = sizeof(STRING_1) - 1;
          int n = sizeof(STRING_2) - 1;
      
          printf("Length of LCS is %d\n", lcs( X, Y, m, n ) );
      
          return 0;
      }
       
  • rajeev

    Nice post! One of the best contents for DP.

    How to print LCS string?

    • Doom

      here I have written a backtrack function to print the LCS string. However, it prints one of the many(if exist) same length strings. Can someone modify it to print all the strings with same length??
      In the code, c1 and c2 are the respective lengths of the strings and mat is a 2D matrix of size (c1+1)*(c2+1).
      This function is called after LCS populates the matrix.

       
      void backtrack(char *s1,char *s2,int **mat,int c1,int c2)
      {
              if(c1!=0 && c2!=0)
              if(s1[c1-1]==s2[c2-1])
              {
                      backtrack(s1,s2,mat,c1-1,c2-1);
                      printf("%c",s1[c1-1]);
              }
              else
              {
                      if(mat[c1-1][c2] > mat[c1][c2-1])
                              backtrack(s1,s2,mat,c1-1,c2);
                      else
                              backtrack(s1,s2,mat,c1,c2-1);
              }
       
      }
       
      • Teste

        How LCS populates the matrix?

      • Yatendra Goel

        The code below will print all the possible LCSs.

        Note1: This code prints a LCS multiple times if multiple paths leads to the same LCS. One dirty way to fix this by store all the LCSs first and then print only unique LCS. So if you guys have any better idea to stop printing multiple LCS then please reply

        Note2: The below recursive function can be memoized as it solves various same sub-problems multiple times

         
        	private void printAllLCS_Strings_Recursively(int i, int j, String currentLCSStr) {;
        
        		if (i == 0 || j == 0) { // base-case
        			System.out.println("LCS_String: " + currentLCSStr);
        			return;
        		}
        		
        		/*
        		 * Reducing both i & j by 1 because they represent matrix index and
        		 * matrix contains one dummy row and column.
        		 */
        		if (x[i-1] == y[j-1]) {
        			printAllLCS_Strings_Recursively(i-1, j-1, x[i-1] + currentLCSStr);
        		} else {
        			if (c[i-1][j] > c[i][j-1]) // top-cell
        				printAllLCS_Strings_Recursively(i-1, j, currentLCSStr);
        			else if (c[i-1][j] < c[i][j-1]) // left-cell
        				printAllLCS_Strings_Recursively(i, j-1, currentLCSStr);
        			else {
        				
        				printAllLCS_Strings_Recursively(i, j-1, currentLCSStr);
        				printAllLCS_Strings_Recursively(i-1, j, currentLCSStr);
        				
        			}
        		}
        	}
         
  • kartik

    Following is memoized version for the same problem.

     
    /* Memoized implementation of LCS problem */
    #include<stdio.h>
    #include<stdlib.h>
    #define NIL -1
    #define MAX 1000
     
    int max(int a, int b);
    int L[MAX][MAX];
    
    /* Returns length of LCS for X[0..m-1], Y[0..n-1] */
    int lcs( char *X, char *Y, int m, int n )
    {    
       if (L[m][n] == NIL)     
       {
         if (m == 0 || n == 0)
           L[m][n] = 0;
         else if(X[m-1] == Y[n-1])
           L[m][n] = 1 + lcs(X, Y, m-1, n-1);
         else
           L[m][n] = max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n));
       }     
       
       return L[m][n];
    }
    
    void _initialize(int m, int n)
    {
       int i, j;  
       for (i=0; i<=m; i++)
         for (j=0; j<=n; j++)
            L[i][j] = NIL;
    }
    
    /* Utility function to get max of 2 integers */
    int max(int a, int b)
    {
        return (a > b)? a : b;
    }
     
    /* Driver program to test above function */
    int main()
    {
      char X[] = "AGGTAB";
      char Y[] = "GXTXAYB";
     
      int m = strlen(X);
      int n = strlen(Y);
     
      _initialize(m, n);
      printf("Length of LCS is %d\n", lcs( X, Y, m, n ) );
     
      getchar();
      return 0;
    }