Shortest Uncommon Subsequence

4.1

Given two strings S and T, find length of the shortest subsequence in S which is not a subsequence in T. If no such subsequence is possible, return -1. A subsequence is a sequence that appears in the same relative order, but not necessarily contiguous. A string of length n has 2^n different possible subsequences.

String S of length m (1 <= m <= 1000)
String T of length n (1 <= n <= 1000)

Examples:

Input : S = “babab” T = “babba”
Output : 3
The subsequence “aab” of length 3 is 
present in S but not in T.

Input :  S = “abb” T = “abab”
Output : -1
There is no subsequence that is present 
in S but not in T.

Brute Force Method
A simple approach will be to generate all the subsequences of string S and for each subsequence check if it is present in string T or not. Consider example 2 in which S = “abb”,
its subsequences are “”, ”a”, ”b”, ”ab”, ”bb”, ”abb” each of which is present in “abab”. The time complexity of this solution will be exponential.

Efficient (Dynamic Programming)

1.Optimal substructure : Consider two strings S and T of length m and n respectively & let the function to find the shortest uncommon subsequence be shortestSeq (char *S, char *T). For each character in S, if it is not present in T then that character is the answer itself.
Otherwise if it is found at index k then we have the choice of either including it in the shortest uncommon subsequence or not.

If it is included answer = 1 + ShortestSeq( S + 1, T + k + 1) 
If not included answer =  ShortestSeq( S + 1, T) 
The minimum of the two is the answer.

Thus we can see that this problem has optimal substructure property as it can be solved by using solution to sub problems.

2.Overlapping Sub problems
Following is a simple recursive implementation of the above problem.

// A simple recursive C++ program to find shortest
// uncommon subsequence.
#include<bits/stdc++.h>
using namespace std;

#define MAX 1005

/* A recursive function to find the length of
   shortest uncommon subsequence*/
int shortestSeq(char *S, char *T, int m, int n)
{
    // S string is empty
    if (m == 0)
        return MAX;

    // T string is empty
    if (n <= 0)
        return 1;

    // Loop to search for current character
    int k;
    for (k=0; k < n; k++)
        if (T[k] == S[0])
            break;

    // char not found in T
    if (k == n)
        return 1;

    // Return minimum of following two
    // Not including current char in answer
    // Including current char
    return min(shortestSeq(S+1, T, m-1, n),
            1 + shortestSeq(S+1, T+k+1, m-1, n-k-1));
}

// Driver program to test the above function
int main()
{
    char S[] = "babab";
    char T[] = "babba";
    int m = strlen(S), n = strlen(T);
    int ans = shortestSeq(S, T, m, n);
    if (ans >= MAX)
       ans = -1;
    cout << "Length of shortest subsequence is: "
         << ans << endl;
    return 0;
}
Length of shortest subsequence is : 3

If we draw the complete recursion tree, then we can see that there are many sub problems which are solved again and again. So this problem has Overlapping Substructure property and recomputation of same sub problems can be avoided by either using Memoization or Tabulation. Following is a tabulated implementation for the problem.

LongestUncommonSubSeq

// A dynamic programming based C++ program
// to find shortest uncommon subsequence.
#include<bits/stdc++.h>
using namespace std;

#define MAX 1005

// Returns length of shortest common subsequence
int shortestSeq(char *S, char *T)
{
    int m = strlen(S), n = strlen(T);

    // declaring 2D array of m + 1 rows and
    // n + 1 columns dynamically
    int dp[m+1][n+1];

    // T string is empty
    for (int i = 0; i <= m; i++)
        dp[i][0] = 1;

    // S string is empty
    for (int i = 0; i <= n; i++)
        dp[0][i] = MAX;

    for (int i = 1; i <= m; i++)
    {
        for (int j = 1; j <= n; j++)
        {
            char ch = S[i-1];
            int k;
            for (k = j-1; k >= 0; k--)
                if (T[k] == ch)
                    break;

            // char not present in T
            if (k == -1)
                dp[i][j] = 1;
            else
               dp[i][j] = min(dp[i-1][j], dp[i-1][k] + 1);
        }
    }

    int ans = dp[m][n];
    if (ans >= MAX)
        ans = -1;

    return ans;
}

// Driver program to test the above function
int main()
{
    char S[] = "babab";
    char T[] = "babba";
    int m = strlen(S), n = strlen(T);
    cout << "Length of shortest subsequence is : "
         << shortestSeq(S, T) << endl;
}

Output:

Length of shortest subsequence is : 3
Time complexity : O(mn^2)
Space Complexity : O(mn)

This article is contributed by Aditi Sharma. If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.

GATE CS Corner    Company Wise Coding Practice

Recommended Posts:



4.1 Average Difficulty : 4.1/5.0
Based on 16 vote(s)










Writing code in comment? Please use ide.geeksforgeeks.org, generate link and share the link here.