C Program for Rabin-Karp Algorithm for Pattern Searching

Last Updated : 13 Feb, 2023

Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] in txt[]. You may assume that n > m.

Examples:

Input:  txt[] = "THIS IS A TEST TEXT"
        pat[] = "TEST"
Output: Pattern found at index 10

Input:  txt[] =  "AABAACAADAABAABA"
        pat[] =  "AABA"
Output: Pattern found at index 0
        Pattern found at index 9
        Pattern found at index 12

The Naive String Matching algorithm slides the pattern over the text one position at a time. After each slide, it compares the characters at the current shift one by one and prints the match if all of them agree.
Like the Naive algorithm, the Rabin-Karp algorithm also slides the pattern one position at a time. Unlike the Naive algorithm, however, it first compares the hash value of the pattern with the hash value of the current substring of the text, and compares individual characters only when the two hash values match. So the Rabin-Karp algorithm needs to calculate hash values for the following strings.
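For comparison, the naive approach described above can be sketched as follows. This is a minimal illustration rather than code from the original article: the name naive_search and its out/max_out parameters are hypothetical, and the function collects match positions in an array instead of printing them.

```c
#include <stddef.h>
#include <string.h>

/* Naive matcher: try every shift i and compare characters one by one.
   Stores up to max_out match positions in out[] and returns the total
   number of matches found. naive_search is a hypothetical helper name. */
size_t naive_search(const char *pat, const char *txt,
                    size_t out[], size_t max_out)
{
    size_t m = strlen(pat);
    size_t n = strlen(txt);
    size_t count = 0;

    if (m == 0 || m > n)
        return 0; /* empty or over-long pattern: report no matches */

    for (size_t i = 0; i + m <= n; i++) {
        size_t j = 0;
        while (j < m && txt[i + j] == pat[j])
            j++;
        if (j == m) { /* all m characters matched at shift i */
            if (count < max_out)
                out[count] = i;
            count++;
        }
    }
    return count;
}
```

For the second example above (txt = "AABAACAADAABAABA", pat = "AABA"), this sketch should report the shifts 0, 9 and 12. Every shift may cost up to m character comparisons, which is the work Rabin-Karp tries to avoid.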

1) Pattern itself.
2) All the substrings of text of length m.
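Hashing every length-m substring from scratch would cost as much as the naive comparison, so the hash of each window is normally derived from the hash of the previous one. The sketch below illustrates that idea under stated assumptions: a polynomial hash with base 256 and a small prime modulus q (the same scheme the program below uses), with characters read as unsigned char. The names direct_hash, leading_weight and roll_hash are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>

#define BASE 256 /* assumed alphabet size */

/* Hash of the m characters starting at s, computed from scratch:
   (s[0]*BASE^(m-1) + ... + s[m-1]) mod q. */
int direct_hash(const char *s, size_t m, int q)
{
    int hash = 0;
    for (size_t i = 0; i < m; i++)
        hash = (BASE * hash + (unsigned char)s[i]) % q;
    return hash;
}

/* BASE^(m-1) mod q, the weight of the leading character of a window. */
int leading_weight(size_t m, int q)
{
    int h = 1;
    for (size_t i = 1; i < m; i++)
        h = (h * BASE) % q;
    return h;
}

/* Hash of the window shifted right by one position: remove the leading
   character `out`, append the new trailing character `in`.
   Assumes q is small enough that BASE * 255 * q fits in an int. */
int roll_hash(int hash, char out, char in, int h, int q)
{
    int next = (BASE * (hash - (unsigned char)out * h)
                + (unsigned char)in) % q;
    if (next < 0) /* % can yield a negative result in C */
        next += q;
    return next;
}
```

Starting from direct_hash(txt, m, q), repeated calls to roll_hash(hash, txt[i], txt[i + m], h, q) should give the same value as hashing each window txt[i+1..i+m] directly, but in constant time per shift.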

C/C++

/* Following program is a C implementation of Rabin Karp 
Algorithm given in the CLRS book */
#include <stdio.h> 
#include <string.h> 
  
// d is the number of characters in the input alphabet 
#define d 256 
  
/* pat -> pattern 
    txt -> text 
    q -> A prime number 
*/
void search(char pat[], char txt[], int q) 
{ 
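    int M = strlen(pat); 
    int N = strlen(txt); 
    int i, j; 
    int p = 0; // hash value for pattern 
    int t = 0; // hash value for the current window of txt 
    int h = 1; // becomes pow(d, M-1) % q 
  
    // A pattern longer than the text cannot match; returning here 
    // also keeps the first-window loop from reading past txt 
    if (M > N) 
        return; 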
  
    // The value of h would be "pow(d, M-1)%q" 
    for (i = 0; i < M - 1; i++) 
        h = (h * d) % q; 
  
    // Calculate the hash value of pattern and first 
    // window of text 
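    for (i = 0; i < M; i++) { 
        p = (d * p + pat[i]) % q; 
        t = (d * t + txt[i]) % q; 
    } 
  
    // Plain char may be signed, so bytes >= 0x80 can leave p or t 
    // negative; bring both into [0, q) so they compare correctly 
    // with the window hashes computed below 
    if (p < 0) 
        p += q; 
    if (t < 0) 
        t += q; 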
  
    // Slide the pattern over text one by one 
    for (i = 0; i <= N - M; i++) { 
  
        // Check the hash values of current window of text 
        // and pattern. If the hash values match then only 
        // check for characters one by one 
        if (p == t) { 
            /* Check for characters one by one */
            for (j = 0; j < M; j++) { 
                if (txt[i + j] != pat[j]) 
                    break; 
            } 
  
            // if p == t and pat[0...M-1] = txt[i, i+1, ...i+M-1] 
            if (j == M) 
                printf("Pattern found at index %d \n", i); 
        } 
  
        // Calculate hash value for next window of text: Remove 
        // leading digit, add trailing digit 
        if (i < N - M) { 
            t = (d * (t - txt[i] * h) + txt[i + M]) % q; 
  
            // We might get negative value of t, converting it 
            // to positive 
            if (t < 0) 
                t = (t + q); 
        } 
    } 
} 
  
/* Driver program to test above function */
int main(void) 
{ 
    char txt[] = "GEEKS FOR GEEKS"; 
    char pat[] = "GEEK"; 
    int q = 101; // A prime number 
    search(pat, txt, q); 
    return 0; 
} 

Output:

Pattern found at index 0 
Pattern found at index 10
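Because different strings can share a hash value, a hash match alone does not prove a match, which is why the program compares characters after each hash hit. The sketch below is one way to observe this; it is an illustration under assumptions, not part of the original program. The name count_hits and its spurious output parameter are hypothetical, and characters are read as unsigned char.

```c
#include <string.h>

#define BASE 256 /* alphabet size, as in the program above */

/* Rabin-Karp scan that counts instead of printing.
   Returns the number of true matches; *spurious receives the number of
   windows whose hash equalled the pattern hash but whose characters
   differed. count_hits is a hypothetical helper name. Assumes q is
   small enough that BASE * 255 * q fits in an int. */
int count_hits(const char *pat, const char *txt, int q, int *spurious)
{
    int m = (int)strlen(pat);
    int n = (int)strlen(txt);
    int p = 0, t = 0, h = 1, matches = 0;

    *spurious = 0;
    if (m == 0 || m > n)
        return 0;

    for (int i = 0; i < m - 1; i++)
        h = (h * BASE) % q;
    for (int i = 0; i < m; i++) {
        p = (BASE * p + (unsigned char)pat[i]) % q;
        t = (BASE * t + (unsigned char)txt[i]) % q;
    }

    for (int i = 0; i <= n - m; i++) {
        if (p == t) {
            if (memcmp(txt + i, pat, (size_t)m) == 0)
                matches++;      /* hash hit confirmed by the characters */
            else
                (*spurious)++;  /* hash hit, but the characters differ */
        }
        if (i < n - m) {
            t = (BASE * (t - (unsigned char)txt[i] * h)
                 + (unsigned char)txt[i + m]) % q;
            if (t < 0)
                t += q;
        }
    }
    return matches;
}
```

With the degenerate modulus q = 1, every window hashes to 0, so all 12 windows of "GEEKS FOR GEEKS" are hash hits for "GEEK" and 10 of them are spurious. A larger prime such as 101 is expected to filter out most windows before any character comparison, while still returning the same 2 true matches.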

Please refer to the complete article on Rabin-Karp Algorithm for Pattern Searching for more details!
