Count distinct occurrences as a subsequence

Given a two strings S and T, find the count of distinct occurrences of T in S as a subsequence.

Examples:

Input: S = banana, T = ban
Output: 3
Explanation: T appears in S as below three subsequences.
[ban], [ba  n], [b   an]

Input: S = geeksforgeeks, T = ge
Output: 6
Explanation: T appears in S as below three subsequences.
[ge], [     ge], [g e], [g    e] [g     e]
and [     g e]      



Approach: Create a recursive function such that it returns count of subsequences of S that match T. Here m is the length of T and n is length of S. This problem can be recursively defined as below.

    Base Cases:

    1. Given the string T is an empty string, returning 1 as an empty string can be the subsequence of all.
    2. Given the string S is an empty string, returning 0 as no string can be the subsequence of an empty string.

    Recursive cases:

    1. If the last character of S and T do not match, then remove the last character of S and call the recursive function again. Because the last character of S cannot be a part of the subsequence or remove it and check for other characters.
    2. If the last character of S match then there can be two possibilities, first there can be a subsequence where the last character of S is a part of it and second where it is not a part of the subsequence. So the required value will be the sum of both. Call the recursive function once with last character of both the strings removed and again with only last character of S removed.


Blue round rectangles represent accepted states or there are a subsequence and red round rectanges represent No subsequence can be formed.
Since there are overlapping subproblems in the above recurrence result, Dynamic Programming approach can be applied to solve the above problem. Store the subproblems in a Hashmap or an array and return the value when the function is called again.

Algorithm:

  1. Create a 2D array mat[m+1][n+1] where m is length of string T and n is length of string S. mat[i][j] denotes the number of distinct subsequence of substring S(1..i) and substring T(1..j) so mat[m][n] contains our solution.
  2. Initialize the first column with all 0s. An empty string can’t have another string as suhsequence
  3. Initialize the first row with all 1s. An empty string is subsequence of all.
  4. Fill the matrix in bottom up manner, i.e. all the sub problems of the current string is calculated first.
  5. Traverse the string T from start to end. (counter is i)
  6. For every iteration of the outer loop, Traverse the string S from start to end. (counter is j)
  7. If the character at ith index of string T matches with jth character of string S, the value is obtained considering two cases. First, is all the substrings without last character in S and second is the substrings without last characters in both, i.e mat[i+1][j] + mat[i][j] .
  8. Else the value will be same even if jth character of S is removed, i.e. mat[i+1][j]
  9. Print the value of mat[m-1][n-1] as the answer.

C++

filter_none

edit
close

play_arrow

link
brightness_4
code

/* C/C++ program to count number of times S appears
   as a subsequence in T */
#include <bits/stdc++.h>
using namespace std;
  
int findSubsequenceCount(string S, string T)
{
    int m = T.length(), n = S.length();
  
    // T can't appear as a subsequence in S
    if (m > n)
        return 0;
  
    // mat[i][j] stores the count of occurrences of
    // T(1..i) in S(1..j).
    int mat[m + 1][n + 1];
  
    // Initializing first column with all 0s. An empty
    // string can't have another string as suhsequence
    for (int i = 1; i <= m; i++)
        mat[i][0] = 0;
  
    // Initializing first row with all 1s. An empty
    // string is subsequence of all.
    for (int j = 0; j <= n; j++)
        mat[0][j] = 1;
  
    // Fill mat[][] in bottom up manner
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
  
            // If last characters don't match, then value
            // is same as the value without last character
            // in S.
            if (T[i - 1] != S[j - 1])
                mat[i][j] = mat[i][j - 1];
  
            // Else value is obtained considering two cases.
            // a) All substrings without last character in S
            // b) All substrings without last characters in
            // both.
            else
                mat[i][j] = mat[i][j - 1] + mat[i - 1][j - 1];
        }
    }
  
    /* uncomment this to print matrix mat
    for (int i = 1; i <= m; i++, cout << endl)
        for (int j = 1; j <= n; j++)
            cout << mat[i][j] << " ";  */
    return mat[m][n];
}
  
// Driver code to check above method
int main()
{
    string T = "ge";
    string S = "geeksforgeeks";
    cout << findSubsequenceCount(S, T) << endl;
    return 0;
}

chevron_right


Java

filter_none

edit
close

play_arrow

link
brightness_4
code

// Java program to count number of times
// S appears as a subsequence in T
import java.io.*;
  
class GFG {
    static int findSubsequenceCount(String S, String T)
    {
        int m = T.length();
        int n = S.length();
  
        // T can't appear as a subsequence in S
        if (m > n)
            return 0;
  
        // mat[i][j] stores the count of
        // occurrences of T(1..i) in S(1..j).
        int mat[][] = new int[m + 1][n + 1];
  
        // Initializing first column with
        // all 0s. An emptystring can't have
        // another string as suhsequence
        for (int i = 1; i <= m; i++)
            mat[i][0] = 0;
  
        // Initializing first row with all 1s.
        // An empty string is subsequence of all.
        for (int j = 0; j <= n; j++)
            mat[0][j] = 1;
  
        // Fill mat[][] in bottom up manner
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                // If last characters don't match,
                // then value is same as the value
                // without last character in S.
                if (T.charAt(i - 1) != S.charAt(j - 1))
                    mat[i][j] = mat[i][j - 1];
  
                // Else value is obtained considering two cases.
                // a) All substrings without last character in S
                // b) All substrings without last characters in
                // both.
                else
                    mat[i][j] = mat[i][j - 1] + mat[i - 1][j - 1];
            }
        }
  
        /* uncomment this to print matrix mat
        for (int i = 1; i <= m; i++, cout << endl)
            for (int j = 1; j <= n; j++)
                System.out.println ( mat[i][j] +" "); */
        return mat[m][n];
    }
  
    // Driver code to check above method
    public static void main(String[] args)
    {
        String T = "ge";
        String S = "geeksforgeeks";
        System.out.println(findSubsequenceCount(S, T));
    }
}
// This code is contributed by vt_m

chevron_right


Python3

filter_none

edit
close

play_arrow

link
brightness_4
code

# Python3 program to count number of times 
# S appears as a subsequence in T
def findSubsequenceCount(S, T):
  
    m = len(T)
    n = len(S)
  
    # T can't appear as a subsequence in S
    if m > n:
        return 0
  
    # mat[i][j] stores the count of 
    # occurrences of T(1..i) in S(1..j).
    mat = [[0 for _ in range(n + 1)]
              for __ in range(m + 1)]
  
    # Initializing first column with all 0s. x
    # An empty string can't have another
    # string as suhsequence
    for i in range(1, m + 1):
        mat[i][0] = 0
  
    # Initializing first row with all 1s. 
    # An empty string is subsequence of all.
    for j in range(n + 1):
        mat[0][j] = 1
  
    # Fill mat[][] in bottom up manner
    for i in range(1, m + 1):
        for j in range(1, n + 1):
  
            # If last characters don't match, 
            # then value is same as the value 
            # without last character in S.
            if T[i - 1] != S[j - 1]:
                mat[i][j] = mat[i][j - 1]
                  
            # Else value is obtained considering two cases.
            # a) All substrings without last character in S
            # b) All substrings without last characters in
            # both.
            else:
                mat[i][j] = (mat[i][j - 1] + 
                             mat[i - 1][j - 1])
  
    return mat[m][n]
  
# Driver Code
if __name__ == "__main__":
    T = "ge"
    S = "geeksforgeeks"
    print(findSubsequenceCount(S, T))
  
# This code is contributed 
# by vibhu4agarwal

chevron_right


C#

filter_none

edit
close

play_arrow

link
brightness_4
code

// C# program to count number of times
// S appears as a subsequence in T
using System;
  
class GFG {
  
    static int findSubsequenceCount(string S, string T)
    {
        int m = T.Length;
        int n = S.Length;
  
        // T can't appear as a subsequence in S
        if (m > n)
            return 0;
  
        // mat[i][j] stores the count of
        // occurrences of T(1..i) in S(1..j).
        int[, ] mat = new int[m + 1, n + 1];
  
        // Initializing first column with
        // all 0s. An emptystring can't have
        // another string as suhsequence
        for (int i = 1; i <= m; i++)
            mat[i, 0] = 0;
  
        // Initializing first row with all 1s.
        // An empty string is subsequence of all.
        for (int j = 0; j <= n; j++)
            mat[0, j] = 1;
  
        // Fill mat[][] in bottom up manner
        for (int i = 1; i <= m; i++) {
  
            for (int j = 1; j <= n; j++) {
  
                // If last characters don't match,
                // then value is same as the value
                // without last character in S.
                if (T[i - 1] != S[j - 1])
                    mat[i, j] = mat[i, j - 1];
  
                // Else value is obtained considering two cases.
                // a) All substrings without last character in S
                // b) All substrings without last characters in
                // both.
                else
                    mat[i, j] = mat[i, j - 1] + mat[i - 1, j - 1];
            }
        }
  
        /* uncomment this to print matrix mat
        for (int i = 1; i <= m; i++, cout << endl)
            for (int j = 1; j <= n; j++)
                System.out.println ( mat[i][j] +" "); */
        return mat[m, n];
    }
  
    // Driver code to check above method
    public static void Main()
    {
        string T = "ge";
        string S = "geeksforgeeks";
        Console.WriteLine(findSubsequenceCount(S, T));
    }
}
  
// This code is contributed by vt_m

chevron_right


PHP

filter_none

edit
close

play_arrow

link
brightness_4
code

<?php
// PHP program to count number of times 
// S appears as a subsequence in T */
  
function findSubsequenceCount($S, $T)
{
    $m = strlen($T); $n = strlen($S);
  
    // T can't appear as a subsequence in S
    if ($m > $n)
        return 0;
  
    // mat[i][j] stores the count of 
    // occurrences of T(1..i) in S(1..j).
    $mat = array(array());
  
    // Initializing first column with all 0s. 
    // An empty string can't have another 
    // string as suhsequence
    for ($i = 1; $i <= $m; $i++)
        $mat[$i][0] = 0;
  
    // Initializing first row with all 1s. 
    // An empty string is subsequence of all.
    for ($j = 0; $j <= $n; $j++)
        $mat[0][$j] = 1;
  
    // Fill mat[][] in bottom up manner
    for ($i = 1; $i <= $m; $i++)
    {
        for ($j = 1; $j <= $n; $j++)
        {
            // If last characters don't match, 
            // then value is same as the value
            // without last character in S.
            if ($T[$i - 1] != $S[$j - 1])
                $mat[$i][$j] = $mat[$i][$j - 1];
  
            // Else value is obtained considering two cases.
            // a) All substrings without last character in S
            // b) All substrings without last characters in
            // both.
            else
                $mat[$i][$j] = $mat[$i][$j - 1] + 
                               $mat[$i - 1][$j - 1];
        }
    }
  
    /* uncomment this to print matrix mat
    for (int i = 1; i <= m; i++, cout << endl)
        for (int j = 1; j <= n; j++)
            cout << mat[i][j] << " "; */
    return $mat[$m][$n];
}
  
// Driver Code
$T = "ge";
$S = "geeksforgeeks";
echo findSubsequenceCount($S, $T) . "\n";
  
// This code is contributed 
// by Akanksha Rai

chevron_right



Output:

6

Complexity Analysis:

  • Time Complexity: O(m*n).
    Only one traversal of the matrix is needed, so the time Compelxiy is O(m*n)
  • Auxiliary Space: O(m*n).
    A matrix of size m*n is needed so the space complexity is O(m*n).
    Note:Since mat[i][j] accesses elements of the current row and previous row only, we can optimize auxiliary space just by using two rows only reducing space from m*n to 2*n.

This article is contributed by Utkarsh Trivedi. If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.

GeeksforGeeks has prepared a complete interview preparation course with premium videos, theory, practice problems, TA support and many more features. Please refer Placement 100 for details




My Personal Notes arrow_drop_up