Skip to content
Related Articles

Related Articles

Improve Article

Count of occurrences of each prefix in a string using modified KMP algorithm

  • Difficulty Level : Medium
  • Last Updated : 14 Dec, 2020
Geek Week

Given a string S of size N, the task is to count the occurrences of all the prefixes of the given string S.

Examples:  

Input: S = “AAAA” 
Output: 
A occurs 4 times 
AA occurs 3 times. 
AAA occurs 2 times. 
AAAA occurs 1 times. 
Explanation: 
Below is the illustration of all the prefix: 
 

Input: S = “ABACABA” 
Output: 
A occurs 4 times 
AB occurs 2 times 
ABA occurs 2 times 
ABAC occurs 1 times 
ABACA occurs 1 times 
ABACAB occurs 1 times 
ABACABA occurs 1 times 
 



Naive Approach:  

  1. Traverse over all the prefixes in set P. Let the x be the prefix.
  2. Do a sliding window approach of size |x|.
  3. Check if the current sliding window on S is equal to x. If yes then increase the count[x] by 1.

Time complexity: O(N3
Auxiliary Space: O(N)

Efficient Approach: 
Use the LPS array (also called prefix_function) from the KMP algorithm
The prefix function for this string is defined as an array LPS of length N, where LPS[i] is the length of the longest proper prefix of the substring S[0…i] which is also a suffix of this substring. Let occ[i] denote the number of occurrences of the prefix of length i.

Below are the steps to implement this approach:  

  1. Compute the LPS array or prefix_function.
  2. For each value of the prefix function, first, count how many times it occurs in the LPS array.
  3. The length prefix i appears exactly ans[i] times, then this number must be added to the number of occurrences of its longest suffix that is also a prefix.
  4. In the end, add 1 to all the values of occ array, because of the original prefix that should be counted as well.

For example: 
LPS[i] denotes that in position i, a prefix of length = LPS[i] appears. And this is the longest prefix possible. But shorter prefixes can occur. 
For String S = “AAAA”, following are the prefixes: 
 

S[0..0] = A 
S[0..1] = AA 
S[0..2] = AAA 
S[0..3] = AAAA 

Initially:  

occ[A] = 0 
occ[AA] = 0 
occ[AAA] = 0 
occ[AAAA] = 0 

Step1: LPS Array of the following string denotes the length of the longest prefix which is also a suffix: 



LPS[1] denotes in string AA, A is a suffix and also a prefix as LPS[1] = 1 
LPS[2] denotes in string AAA, AA is a suffix and also a prefix as LPS[2] = 2 
LPS[3] denotes in string AAAA, AAA is a suffix and also a prefix as LPS[3] = 3

Step 2:Add these occurrences of prefixes as suffixes to the answer in the occ[] array:  

Values : Counted substrings 
occ[A] = 1 : S[1] 
occ[AA] = 1 : S[1..2] 
occ[AAA] = 1 : S[1..3] 
occ[AAAA] = 0 : NULL(as there is not a prefix “AAAA” which is also a suffix. 
 

Step 3: Now traverse the string in reverse order starting from “AAA” (as the last value will always be 0 since the complete string is not a proper prefix). 

Since, string “AAA” S[1..3] contains “AA” S[2..3] as well, which was not counted yet, therefore increment the occurrence of string “AA” in occ[“AA”] as occ[“AA”] += occ[“AAA”]. Below is the count for the same: 
Values : Counted substrings 
occ[A] = 1 : S[1] 
occ[AA] = 2 : S[1..2], S[2..3] 
occ[AAA] = 1 : S[1..3] 
occ[AAAA] = 0 : NULL 

Now string “AA” contains “A” as well, which was not counted yet, therefore increment the occurrence of string “A” in occ[“A”] as occ[“A”] += occ[“AA”]. Below is the count for the same: 
 

Values : Counted substrings 
occ[A] = 3 : S[1], S[2], S[3] 
occ[AA] = 2 : S[1..2], S[2..3] 
occ[AAA] = 1 : S[1..3] 
occ[AAAA] = 0 : NULL 

Step 4: At last add one to all occurrences for the original prefixes, which are not counted yet.  

Values : Counted substrings 
occ[A] = 4 : S[1], S[2], S[3], S[0] 
occ[AA] = 3 : S[1..2], S[2..3], S[0..1] 
occ[AAA] = 2 : S[1..3], S[0..2] 
occ[AAAA] = 1 : S[0..3]  

Below is the implementation of the above approach:

C++




// C++ program for the above approach
#include <bits/stdc++.h>
using namespace std;
 
// Function to print the count of all
// prefix in the given string
void print(vector<int>& occ, string& s)
{
    // Iterate over string s
    for (int i = 1; i <= int(s.size());
         i++) {
 
        // Print the prefix and their
        // frequency
        cout << s.substr(0, i)
             << " occurs "
             << occ[i]
             << " times."
             << endl;
    }
}
 
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// substring of the string S
vector<int> prefix_function(string& s)
{
    // Array to store LPS values
    vector<int> LPS(s.size());
 
    // Value of lps[0] is 0
    // by definition
    LPS[0] = 0;
 
    // Find the values of LPS[i] for
    // the rest of the string using
    // two pointers and DP
    for (int i = 1;
         i < int(s.size());
         i++) {
 
        // Initially set the value
        // of j as the longest
        // prefix that is also a
        // suffix for i as LPS[i-1]
        int j = LPS[i - 1];
 
        // Check if the suffix of
        // length j+1 is also a prefix
        while (j > 0 && s[i] != s[j]) {
            j = LPS[j - 1];
        }
 
        // If s[i] = s[j] then, assign
        // LPS[i] as j+1
        if (s[i] == s[j]) {
            LPS[i] = j + 1;
        }
 
        // If we reached j = 0, assign
        // LPS[i] as 0 as there was no
        // prefix equal to suffix
        else {
            LPS[i] = 0;
        }
    }
 
    // Return the calculated
    // LPS array
    return LPS;
}
 
// Function to count the occurrence
// of all the prefix in the string S
void count_occurence(string& s)
{
    int n = s.size();
 
    // Call the prefix_function
    // to get LPS
    vector<int> LPS
        = prefix_function(s);
 
    // To store the occurrence of
    // all the prefix
    vector<int> occ(n + 1);
 
    // Count all the suffixes that
    // are also prefix
    for (int i = 0; i < n; i++) {
        occ[LPS[i]]++;
    }
 
    // Add the occurences of
    // i to smaller prefixes
    for (int i = n - 1;
         i > 0; i--) {
        occ[LPS[i - 1]] += occ[i];
    }
 
    // Adding 1 to all occ[i] for all
    // the orignal prefix
    for (int i = 0; i <= n; i++)
        occ[i]++;
 
    // Function Call to print the
    // occurence of all the prefix
    print(occ, s);
}
 
// Driver Code
int main()
{
    // Given String
    string A = "ABACABA";
 
    // Function Call
    count_occurence(A);
    return 0;
}

Java




// Java program for
// the above approach
import java.util.*;
class GFG{
 
// Function to print the count
// of all prefix in the
// given String
static void print(int[] occ,
                  String s)
{
  // Iterate over String s
  for (int i = 1;
           i <= s.length() - 1; i++)
  {
    // Print the prefix and their
    // frequency
    System.out.print(s.substring(0, i) +
                     " occurs " + occ[i] +
                     " times." + "\n");
  }
}
 
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// subString of the String S
static int[] prefix_function(String s)
{
  // Array to store LPS values
  int []LPS = new int[s.length()];
 
  // Value of lps[0] is 0
  // by definition
  LPS[0] = 0;
 
  // Find the values of LPS[i] for
  // the rest of the String using
  // two pointers and DP
  for (int i = 1;
       i < s.length(); i++)
  {
    // Initially set the value
    // of j as the longest
    // prefix that is also a
    // suffix for i as LPS[i-1]
    int j = LPS[i - 1];
 
    // Check if the suffix of
    // length j+1 is also a prefix
    while (j > 0 &&
           s.charAt(i) != s.charAt(j))
    {
      j = LPS[j - 1];
    }
 
    // If s[i] = s[j] then, assign
    // LPS[i] as j+1
    if (s.charAt(i) == s.charAt(j))
    {
      LPS[i] = j + 1;
    }
 
    // If we reached j = 0, assign
    // LPS[i] as 0 as there was no
    // prefix equal to suffix
    else
    {
      LPS[i] = 0;
    }
  }
 
  // Return the calculated
  // LPS array
  return LPS;
}
 
// Function to count the occurrence
// of all the prefix in the String S
static void count_occurence(String s)
{
  int n = s.length();
 
  // Call the prefix_function
  // to get LPS
  int[] LPS = prefix_function(s);
 
  // To store the occurrence of
  // all the prefix
  int []occ = new int[n + 1];
 
  // Count all the suffixes that
  // are also prefix
  for (int i = 0; i < n; i++)
  {
    occ[LPS[i]]++;
  }
 
  // Add the occurences of
  // i to smaller prefixes
  for (int i = n - 1;
           i > 0; i--)
  {
    occ[LPS[i - 1]] += occ[i];
  }
 
  // Adding 1 to all occ[i] for all
  // the orignal prefix
  for (int i = 0; i <= n; i++)
    occ[i]++;
 
  // Function Call to print the
  // occurence of all the prefix
  print(occ, s);
}
 
// Driver Code
public static void main(String[] args)
{
  // Given String
  String A = "ABACABA";
 
  // Function Call
  count_occurence(A);
}
}
 
// This code is contributed by Princi Singh

Python3




# Python3 program for the above approach
 
# Function to print the count of all
# prefix in the given string
def Print(occ, s):
     
    # Iterate over string s
    for i in range(1, len(s) + 1):
 
        # Print the prefix and their
        # frequency
        print(s[0 : i], "occur", occ[i], "times.")
 
# Function to implement the LPS
# array to store the longest prefix
# which is also a suffix for every
# substring of the string S
def prefix_function(s):
 
    # Array to store LPS values
    # Value of lps[0] is 0
    # by definition
    LPS = [0 for i in range(len(s))]
     
    # Find the values of LPS[i] for
    # the rest of the string using
    # two pointers and DP
    for i in range(1, len(s)):
 
        # Initially set the value
        # of j as the longest
        # prefix that is also a
        # suffix for i as LPS[i-1]
        j = LPS[i - 1]
 
        # Check if the suffix of
        # length j+1 is also a prefix
        while (j > 0 and s[i] != s[j]):
            j = LPS[j - 1]
 
        # If s[i] = s[j] then, assign
        # LPS[i] as j+1
        if (s[i] == s[j]):
            LPS[i] = j + 1
             
        # If we reached j = 0, assign
        # LPS[i] as 0 as there was no
        # prefix equal to suffix
        else:
            LPS[i] = 0
 
    # Return the calculated
    # LPS array
    return LPS
 
# Function to count the occurrence
# of all the prefix in the string S
def count_occurence(s):
     
    n = len(s)
 
    # Call the prefix_function
    # to get LPS
    LPS = prefix_function(s)
 
    # To store the occurrence of
    # all the prefix
    occ = [0 for i in range(n + 1)]
 
    # Count all the suffixes that
    # are also prefix
    for i in range(n):
        occ[LPS[i]] += 1
 
    # Add the occurences of
    # i to smaller prefixes
    for i in range(n - 1, 0, -1):
        occ[LPS[i - 1]] += occ[i]
     
    # Adding 1 to all occ[i] for all
    # the orignal prefix
    for i in range(n + 1):
        occ[i] += 1
         
    # Function Call to print the
    # occurence of all the prefix
    Print(occ, s)
 
# Driver Code
 
# Given String
A = "ABACABA"
 
# Function Call
count_occurence(A)
 
# This code is contributed by avanitrachhadiya2155

C#




// C# program for
// the above approach
using System;
class GFG{
 
// Function to print the
// count of all prefix
// in the given String
static void print(int[] occ,
                  String s)
{
  // Iterate over String s
  for (int i = 1;
           i <= s.Length - 1; i++)
  {
    // Print the prefix and their
    // frequency
    Console.Write(s.Substring(0, i) + 
                  " occurs " + occ[i] + 
                  " times." + "\n");
  }
}
 
// Function to implement the LPS
// array to store the longest prefix
// which is also a suffix for every
// subString of the String S
static int[] prefix_function(String s)
{
  // Array to store LPS values
  int []LPS = new int[s.Length];
 
  // Value of lps[0] is 0
  // by definition
  LPS[0] = 0;
 
  // Find the values of LPS[i] for
  // the rest of the String using
  // two pointers and DP
  for (int i = 1;
           i < s.Length; i++)
  {
    // Initially set the value
    // of j as the longest
    // prefix that is also a
    // suffix for i as LPS[i-1]
    int j = LPS[i - 1];
 
    // Check if the suffix of
    // length j+1 is also a prefix
    while (j > 0 && s[i] != s[j])
    {
      j = LPS[j - 1];
    }
 
    // If s[i] = s[j] then,
    // assign LPS[i] as j+1
    if (s[i] == s[j])
    {
      LPS[i] = j + 1;
    }
 
    // If we reached j = 0, assign
    // LPS[i] as 0 as there was no
    // prefix equal to suffix
    else
    {
      LPS[i] = 0;
    }
  }
 
  // Return the calculated
  // LPS array
  return LPS;
}
 
// Function to count the occurrence
// of all the prefix in the String S
static void count_occurence(String s)
{
  int n = s.Length;
 
  // Call the prefix_function
  // to get LPS
  int[] LPS = prefix_function(s);
 
  // To store the occurrence of
  // all the prefix
  int []occ = new int[n + 1];
 
  // Count all the suffixes that
  // are also prefix
  for (int i = 0; i < n; i++)
  {
    occ[LPS[i]]++;
  }
 
  // Add the occurences of
  // i to smaller prefixes
  for (int i = n - 1;
           i > 0; i--)
  {
    occ[LPS[i - 1]] += occ[i];
  }
 
  // Adding 1 to all occ[i] for all
  // the orignal prefix
  for (int i = 0; i <= n; i++)
    occ[i]++;
 
  // Function Call to print the
  // occurence of all the prefix
  print(occ, s);
}
 
// Driver Code
public static void Main(String[] args)
{
  // Given String
  String A = "ABACABA";
 
  // Function Call
  count_occurence(A);
}
}
 
// This code is contributed by Amit Katiyar
Output: 
A occurs 4 times.
AB occurs 2 times.
ABA occurs 2 times.
ABAC occurs 1 times.
ABACA occurs 1 times.
ABACAB occurs 1 times.
ABACABA occurs 1 times.

 

Time Complexity: O(N2) 
Axillary Space: O(N)

Attention reader! Don’t stop learning now. Get hold of all the important DSA concepts with the DSA Self Paced Course at a student-friendly price and become industry ready.  To complete your preparation from learning a language to DS Algo and many more,  please refer Complete Interview Preparation Course.

In case you wish to attend live classes with experts, please refer DSA Live Classes for Working Professionals and Competitive Programming Live for Students.




My Personal Notes arrow_drop_up
Recommended Articles
Page :