Given two strings s1 and s2, the task is to find the length of the longest common subsequence present in both of them.
Input: s1 = “ABCDGH”, s2 = “AEDFHR”
Output: 3
The LCS is “ADH”, of length 3.
Input: s1 = “AGGTAB”, s2 = “GXTXAYB”
Output: 4
The LCS is “GTAB”, of length 4.
Input: s1 = “striver”, s2 = “raj”
Output: 1
The LCS is “r”, of length 1.
The naive solution for this problem is to generate all subsequences of both given sequences and find the longest matching subsequence. A sequence of length n has 2^n possible subsequences, so this solution is exponential in terms of time complexity. The general recursive solution explores the same space of combinations, and hence takes O(2^n) time in the worst case.
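To make the cost of the naive approach concrete, here is a minimal Python sketch that enumerates every subsequence of both strings and keeps the longest one they share. The helper names `all_subsequences` and `lcs_brute_force` are illustrative assumptions, and the approach is only practical for very short strings.

```python
from itertools import combinations


def all_subsequences(s):
    """Return the set of all subsequences of s (including the empty one)."""
    subs = set()
    for k in range(len(s) + 1):
        # Choose k positions in increasing order and keep those characters.
        for idx in combinations(range(len(s)), k):
            subs.add("".join(s[i] for i in idx))
    return subs


def lcs_brute_force(s1, s2):
    """Length of the longest subsequence common to s1 and s2."""
    common = all_subsequences(s1) & all_subsequences(s2)
    # The empty string is always common, so max() never sees an empty set.
    return max(len(c) for c in common)


print(lcs_brute_force("AGGTAB", "GXTXAYB"))
```

Each string of length n contributes up to 2^n subsequences, which is why this sketch slows down sharply as the inputs grow.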
Let the input sequences be X[0..m-1] and Y[0..n-1], of lengths m and n respectively, and let L(X[0..m-1], Y[0..n-1]) be the length of the LCS of the two sequences X and Y. L is defined recursively as follows:
- If either sequence is empty (m == 0 or n == 0), then the LCS has length 0.
- If the last characters of both sequences match (X[m-1] == Y[n-1]), then L(X[0..m-1], Y[0..n-1]) = 1 + L(X[0..m-2], Y[0..n-2])
- If the last characters of both sequences do not match (X[m-1] != Y[n-1]), then L(X[0..m-1], Y[0..n-1]) = MAX ( L(X[0..m-2], Y[0..n-1]), L(X[0..m-1], Y[0..n-2]) )
Given below is the recursive solution to the LCS problem:
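A minimal Python sketch of the recurrence, assuming the sample strings “AGGTAB” and “GXTXAYB”. The name `lcs` and the argument order (X, Y, m, n) follow the recursion tree and the four-argument description in this article; the rest is illustrative.

```python
def lcs(X, Y, m, n):
    """Length of the LCS of X[0..m-1] and Y[0..n-1]."""
    # Base case: an empty prefix has no common subsequence.
    if m == 0 or n == 0:
        return 0
    # Last characters match: count one and shrink both prefixes.
    if X[m - 1] == Y[n - 1]:
        return 1 + lcs(X, Y, m - 1, n - 1)
    # Otherwise drop the last character of one string or the other.
    return max(lcs(X, Y, m - 1, n), lcs(X, Y, m, n - 1))


X = "AGGTAB"
Y = "GXTXAYB"
print("Length of LCS:", lcs(X, Y, len(X), len(Y)))
```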
Output:
Length of LCS: 4
Dynamic Programming using Memoization
Considering the above implementation, the following is a partial recursion tree for input strings “AXYT” and “AYZX”:

                          lcs("AXYT", "AYZX")
                         /                   \
          lcs("AXY", "AYZX")                 lcs("AXYT", "AYZ")
           /             \                    /             \
lcs("AX", "AYZX")  lcs("AXY", "AYZ")  lcs("AXY", "AYZ")  lcs("AXYT", "AY")
In the above partial recursion tree, lcs(“AXY”, “AYZ”) is solved twice. Drawing the complete recursion tree shows that many subproblems are solved again and again. So this problem has the Overlapping Subproblems property, and recomputation of the same subproblems can be avoided by using either Memoization or Tabulation. The tabulation method is discussed in a separate article.
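The repetition can be checked directly by counting how often the plain recursion is invoked with each (m, n) pair. The sketch below is only an illustration; the names `lcs_counted` and `calls` are assumptions introduced here.

```python
from collections import Counter

# Number of times each (m, n) pair is passed to the recursive function.
calls = Counter()


def lcs_counted(X, Y, m, n):
    """Plain recursive LCS that records every (m, n) it is called with."""
    calls[(m, n)] += 1
    if m == 0 or n == 0:
        return 0
    if X[m - 1] == Y[n - 1]:
        return 1 + lcs_counted(X, Y, m - 1, n - 1)
    return max(lcs_counted(X, Y, m - 1, n), lcs_counted(X, Y, m, n - 1))


X, Y = "AXYT", "AYZX"
result = lcs_counted(X, Y, len(X), len(Y))

# Subproblems that were solved more than once.
repeated = {args: count for args, count in calls.items() if count > 1}
print("LCS length:", result)
print("Calls with m = 3, n = 3:", calls[(3, 3)])
print("Distinct subproblems repeated:", len(repeated))
```

The pair (3, 3) corresponds to lcs(“AXY”, “AYZ”), the node that appears twice in the tree above.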
To add memoization to the recursive code, note that only two arguments, m and n, change from one call to the next. The function takes 4 arguments, but the other 2 (the strings themselves) are constant and do not affect the memoization. The repeated calls occur for values of m and n that have already been seen. The following steps lead to the DP solution using memoization.
- Use a 2-D array arr to store the computed lcs(m, n) value at arr[m-1][n-1], since array indices start from 0.
- Whenever the function is called again with the same arguments m and n, do not perform any further recursive calls; return arr[m-1][n-1] instead, since the previous computation of lcs(m, n) has already been stored there. This eliminates the recursive calls that would otherwise happen more than once.
Below is the implementation of the above approach:
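A minimal Python sketch of these steps, assuming the sample strings “AGGTAB” and “GXTXAYB”. Passing the table `arr` as a fifth argument and using -1 to mark entries not yet computed are assumptions made for this example. For very long strings the recursion depth may exceed Python's default limit.

```python
def lcs(X, Y, m, n, arr):
    """Memoized length of the LCS of X[0..m-1] and Y[0..n-1]."""
    # Base case: an empty prefix has no common subsequence.
    if m == 0 or n == 0:
        return 0
    # Reuse the stored answer if lcs(m, n) was computed before.
    if arr[m - 1][n - 1] != -1:
        return arr[m - 1][n - 1]
    if X[m - 1] == Y[n - 1]:
        # Last characters match: count one and shrink both prefixes.
        arr[m - 1][n - 1] = 1 + lcs(X, Y, m - 1, n - 1, arr)
    else:
        # Otherwise drop the last character of one string or the other.
        arr[m - 1][n - 1] = max(lcs(X, Y, m - 1, n, arr),
                                lcs(X, Y, m, n - 1, arr))
    return arr[m - 1][n - 1]


X = "AGGTAB"
Y = "GXTXAYB"
m, n = len(X), len(Y)
# m x n table; -1 means "not computed yet".
arr = [[-1] * n for _ in range(m)]
print("Length of LCS:", lcs(X, Y, m, n, arr))
```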
Output:
Length of LCS: 4
Time Complexity: O(N * M), where N and M are the lengths of the first and second strings respectively.
Auxiliary Space: O(N * M)