
CSES Solutions – String Matching

Last Updated : 02 Apr, 2024

Given a string S and a pattern P, your task is to count the number of positions where the pattern occurs in the string.

Examples:

Input: S = “saippuakauppias”, P = “pp”
Output: 2
Explanation: “pp” appears 2 times in S.

Input: S = “aaaa”, P = “aa”
Output: 3
Explanation: “aa” appears 3 times in S.

Approach: To solve the problem, follow the idea below:

Several string-matching algorithms can find all occurrences of a pattern in a text. The Knuth-Morris-Pratt (KMP) algorithm is a suitable choice for this problem because it finds every occurrence, including overlapping ones, in time linear in the combined length of the text and the pattern.

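Concatenate the Pattern and Text: The first step is to build the string P + "#" + S, with the special character # between the pattern and the text. The separator must be a character that occurs in neither string (in the CSES problem both strings consist of the lowercase letters a to z, so # is safe). Since no match can extend across the separator, every prefix function value in the text part is at most the length of the pattern.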

Compute the Prefix Function: The computePrefix function computes the prefix function of the concatenated string. The prefix function at position i is the length of the longest proper prefix of the substring ending at position i (that is, the first i + 1 characters) that is also a suffix of this substring. A proper prefix is any prefix other than the whole substring. This function is the key part of the KMP algorithm.
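To make the definition concrete, here is a minimal Python sketch that computes the prefix function for the second example (P = "aa", S = "aaaa"). The name prefix_function is chosen for illustration and follows the same logic as the computePrefix routine in the implementations further down. The second call leaves out the separator to show why it is needed.

```python
def prefix_function(s):
    """Return pi, where pi[i] is the length of the longest proper prefix
    of s[:i + 1] that is also a suffix of s[:i + 1]."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        # Start from the best border of the previous position.
        j = pi[i - 1]
        # Fall back to shorter borders until one can be extended.
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


with_separator = prefix_function("aa#aaaa")
without_separator = prefix_function("aaaaaa")

print(with_separator)     # [0, 1, 0, 1, 2, 2, 2]
print(without_separator)  # [0, 1, 2, 3, 4, 5]
```

With the separator, no value exceeds the pattern length 2, and the value 2 appears three times, once per occurrence. Without it, the values keep growing past the pattern length and 2 appears only once, so counting entries equal to the pattern length would give the wrong answer.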

Count the Occurrences: After the prefix function is computed, count how many entries of the prefix function array equal the length of the pattern. An entry equal to the pattern length at index i means that the substring of that length ending at index i matches the whole pattern, and because of the separator this can only happen inside the text part. Each such entry therefore corresponds to exactly one occurrence of the pattern in the text.
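The same scan can also report where each occurrence starts, which is handy for checking the count by hand. The sketch below is illustrative only: match_positions is a hypothetical helper, not part of the article's implementations, and it assumes that # appears in neither input string.

```python
def prefix_function(s):
    """Standard prefix function: pi[i] is the length of the longest proper
    prefix of s[:i + 1] that is also a suffix of s[:i + 1]."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def match_positions(text, pattern):
    """Return the 0-based start indices in text of every occurrence of pattern.

    Assumes '#' does not occur in text or pattern.
    """
    m = len(pattern)
    if m == 0:
        return []
    pi = prefix_function(pattern + "#" + text)
    # A match ending at combined index i starts at text index i - 2 * m:
    # subtract m + 1 to skip the pattern and the separator, then m - 1 to
    # step back from the last matched character to the first one.
    return [i - 2 * m for i in range(2 * m, len(pi)) if pi[i] == m]


print(match_positions("saippuakauppias", "pp"))  # [3, 10]
print(match_positions("aaaa", "aa"))             # [0, 1, 2]
```

The length of each returned list equals the occurrence count from the examples (2 and 3), and the second call shows that overlapping occurrences are counted separately.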

Step-by-step algorithm:

  • Concatenate the pattern, the separator character (#), and the text into a single string: P + "#" + S.
  • Declare the prefix function array pi[] with one entry per character of the concatenated string, all initialized to 0, and a counter for the occurrences.
  • For each index i from 1 to the end of the concatenated string, start with j = pi[i - 1]. While j > 0 and the characters at positions i and j differ, fall back to j = pi[j - 1]. If the characters at positions i and j match, increment j. Store j in pi[i].
  • Iterate over pi[]. Each time an entry equals the length of the pattern, the whole pattern has matched at that position, so increment the count of occurrences.
  • After the entire array has been scanned, print the count of occurrences.

Below is the implementation of the algorithm:

C++
#include <bits/stdc++.h>
using namespace std;

// Function to compute the prefix function of a string for
// KMP algorithm
vector<int> computePrefix(const string& S)
{
    int N = S.length();
    vector<int> pi(N);
    for (int i = 1; i < N; i++) {
        int j = pi[i - 1];
        // Find the longest proper prefix which is also a
        // suffix
        while (j > 0 && S[i] != S[j])
            j = pi[j - 1];
        if (S[i] == S[j])
            j++;
        pi[i] = j;
    }
    return pi;
}

// Function to count the number of occurrences of a pattern
// in a text using KMP algorithm
int countOccurrences(const string& S, const string& P)
{
    // Concatenate pattern and text with a special character
    // in between
    string combined = P + "#" + S;

    // Compute the prefix function
    vector<int> prefixArray = computePrefix(combined);

    int count = 0;
    // Count the number of times the pattern appears in the
    // text
    for (size_t i = 0; i < prefixArray.size(); i++) {
        if (prefixArray[i] == (int)P.size())
            count++;
    }

    return count;
}

// Driver code
int main()
{
    string S = "saippuakauppias";
    string P = "pp";

    cout << countOccurrences(S, P) << "\n";
    return 0;
}
Java
import java.util.*;

public class KMPAlgorithm {
    
    // Function to compute the prefix function of a string for KMP algorithm
    static List<Integer> computePrefix(String S) {
        int N = S.length();
        List<Integer> pi = new ArrayList<>(Collections.nCopies(N, 0));
        for (int i = 1; i < N; i++) {
            int j = pi.get(i - 1);
            // Find the longest proper prefix which is also a suffix
            while (j > 0 && S.charAt(i) != S.charAt(j))
                j = pi.get(j - 1);
            if (S.charAt(i) == S.charAt(j))
                j++;
            pi.set(i, j);
        }
        return pi;
    }

    // Function to count the number of occurrences of a pattern in a text using KMP algorithm
    static int countOccurrences(String S, String P) {
        // Concatenate pattern and text with a special character in between
        String combined = P + "#" + S;

        // Compute the prefix function
        List<Integer> prefixArray = computePrefix(combined);

        int count = 0;
        // Count the number of times the pattern appears in the text
        for (int i = 0; i < prefixArray.size(); i++) {
            if (prefixArray.get(i) == P.length())
                count++;
        }

        return count;
    }

    // Driver code
    public static void main(String[] args) {
        String S = "saippuakauppias";
        String P = "pp";

        System.out.println(countOccurrences(S, P));
    }
}
Python
# Function to compute the prefix function of a string for
# KMP algorithm


def compute_prefix(s):
    n = len(s)
    pi = [0] * n
    j = 0
    for i in range(1, n):
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi

# Function to count the number of occurrences of a pattern
# in a text using KMP algorithm


def count_occurrences(s, p):
    # Concatenate pattern and text with a special character
    # in between
    combined = p + "#" + s

    # Compute the prefix function
    prefix_array = compute_prefix(combined)

    count = 0
    # Count the number of times the pattern appears in the
    # text
    for pi in prefix_array:
        if pi == len(p):
            count += 1

    return count


# Driver code
if __name__ == "__main__":
    S = "saippuakauppias"
    P = "pp"

    print(count_occurrences(S, P))
C#
using System;
using System.Collections.Generic;

public class KMPAlgorithm
{
    // Function to compute the prefix function of a string for KMP algorithm
    static List<int> ComputePrefix(string S)
    {
        int N = S.Length;
        List<int> pi = new List<int>(new int[N]);
        for (int i = 1; i < N; i++)
        {
            int j = pi[i - 1];
            // Find the longest proper prefix which is also a suffix
            while (j > 0 && S[i] != S[j])
                j = pi[j - 1];
            if (S[i] == S[j])
                j++;
            pi[i] = j;
        }
        return pi;
    }

    // Function to count the number of occurrences of a pattern in a text using KMP algorithm
    static int CountOccurrences(string S, string P)
    {
        // Concatenate pattern and text with a special character in between
        string combined = P + "#" + S;

        // Compute the prefix function
        List<int> prefixArray = ComputePrefix(combined);

        int count = 0;
        // Count the number of times the pattern appears in the text
        for (int i = 0; i < prefixArray.Count; i++)
        {
            if (prefixArray[i] == P.Length)
                count++;
        }

        return count;
    }

    // Driver code
    public static void Main(string[] args)
    {
        string S = "saippuakauppias";
        string P = "pp";

        Console.WriteLine(CountOccurrences(S, P));
    }
}
JavaScript
// Function to compute the prefix function of a string for
// KMP algorithm
function computePrefix(S) {
    let N = S.length;
    let pi = new Array(N).fill(0);
    for (let i = 1; i < N; i++) {
        let j = pi[i - 1];
        // Find the longest proper prefix which is also a
        // suffix
        while (j > 0 && S[i] != S[j])
            j = pi[j - 1];
        if (S[i] == S[j])
            j++;
        pi[i] = j;
    }
    return pi;
}

// Function to count the number of occurrences of a pattern
// in a text using KMP algorithm
function countOccurrences(S, P) {
    // Concatenate pattern and text with a special character
    // in between
    let combined = P + "#" + S;

    // Compute the prefix function
    let prefixArray = computePrefix(combined);

    let count = 0;
    // Count the number of times the pattern appears in the
    // text
    for (let i = 0; i < prefixArray.length; i++) {
        if (prefixArray[i] == P.length)
            count++;
    }

    return count;
}

// Driver code
let S = "saippuakauppias";
let P = "pp";

console.log(countOccurrences(S, P));

Output
2

Time Complexity: O(N+M) where N is the length of the text and M is the length of the pattern to be found.
Auxiliary Space: O(N+M), since the concatenated string and its prefix function array each have length N+M+1.
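The implementations above store the concatenated string and a prefix function array of the same length. A common variation, sketched below as an alternative rather than as the method used in this article, computes the prefix function for the pattern only and then scans the text while keeping a single matched-length counter. This keeps the auxiliary space proportional to the pattern length and needs no separator character. The function name count_occurrences_streaming is hypothetical.

```python
def prefix_function(s):
    """Standard prefix function: pi[i] is the length of the longest proper
    prefix of s[:i + 1] that is also a suffix of s[:i + 1]."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        if s[i] == s[j]:
            j += 1
        pi[i] = j
    return pi


def count_occurrences_streaming(text, pattern):
    """Count occurrences of pattern in text (overlaps included) using only
    the prefix function of the pattern."""
    m = len(pattern)
    if m == 0:
        return 0
    pi = prefix_function(pattern)
    count = 0
    j = 0  # number of pattern characters currently matched
    for ch in text:
        # On a mismatch, fall back to the next shorter border of the pattern.
        while j > 0 and ch != pattern[j]:
            j = pi[j - 1]
        if ch == pattern[j]:
            j += 1
        if j == m:
            # Full match: record it, then continue from the longest border
            # so that overlapping occurrences are still found.
            count += 1
            j = pi[j - 1]
    return count


print(count_occurrences_streaming("saippuakauppias", "pp"))  # 2
print(count_occurrences_streaming("aaaa", "aa"))             # 3
```

The running time stays linear in N + M; the trade-off is a slightly longer scan loop in exchange for not building the combined string.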


