PHP Program for Rabin-Karp Algorithm for Pattern Searching

Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] in txt[]. You may assume that n > m.

Examples:

Input:  txt[] = "THIS IS A TEST TEXT"
        pat[] = "TEST"
Output: Pattern found at index 10

Input:  txt[] =  "AABAACAADAABAABA"
        pat[] =  "AABA"
Output: Pattern found at index 0
        Pattern found at index 9
        Pattern found at index 12

The Naive String Matching algorithm slides the pattern over the text one position at a time. After each slide, it compares the characters at the current shift one by one and, if all of them match, prints the match.
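
For comparison, here is a minimal sketch of that naive approach. The function name naiveSearch and its array return value are assumptions made for this illustration; they are not part of the original program, which prints matches instead.

```php
<?php
// Hypothetical helper (not from the original article): returns every
// index at which $pat occurs in $txt, comparing characters at each
// shift without any hashing.
function naiveSearch(string $pat, string $txt): array
{
    $m = strlen($pat);
    $n = strlen($txt);
    $matches = [];

    // Nothing to report for an empty or over-long pattern.
    if ($m === 0 || $m > $n) {
        return $matches;
    }

    // Try every shift of the pattern over the text.
    for ($i = 0; $i <= $n - $m; $i++) {
        $j = 0;
        // Compare characters one by one at the current shift.
        while ($j < $m && $txt[$i + $j] === $pat[$j]) {
            $j++;
        }
        // All $m characters matched.
        if ($j === $m) {
            $matches[] = $i;
        }
    }
    return $matches;
}

// Second example from the top of this article.
foreach (naiveSearch("AABA", "AABAACAADAABAABA") as $index) {
    echo "Pattern found at index ", $index, "\n";
}
```

Every shift can cost up to m character comparisons here, which is the work Rabin-Karp tries to skip by comparing hash values first.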
Like the Naive algorithm, the Rabin-Karp algorithm also slides the pattern over the text one position at a time. Unlike the Naive algorithm, however, it first compares the hash value of the pattern with the hash value of the current substring of the text, and starts comparing individual characters only if the two hash values match. So the Rabin-Karp algorithm needs to calculate hash values for the following strings.



1) Pattern itself.
2) All the substrings of text of length m.
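
The hashes for item 2 do not have to be recomputed from scratch for every window. The sketch below assumes the usual polynomial hash with base d = 256 and prime modulus q = 101 (the values used by the program in this article) and derives each window's hash from the previous one. The helper names windowHash and rollingHashes are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical helper: polynomial hash of $s with base $d, modulo $q.
function windowHash(string $s, int $d, int $q): int
{
    $hash = 0;
    for ($i = 0; $i < strlen($s); $i++) {
        // ord() gives the integer byte value of the character.
        $hash = ($d * $hash + ord($s[$i])) % $q;
    }
    return $hash;
}

// Hypothetical helper: hashes of all length-$m windows of $txt,
// each one obtained from the previous window in constant time.
function rollingHashes(string $txt, int $m, int $d = 256, int $q = 101): array
{
    $n = strlen($txt);
    if ($m === 0 || $m > $n) {
        return [];
    }

    // $h = pow($d, $m - 1) % $q, the weight of the leading character.
    $h = 1;
    for ($i = 0; $i < $m - 1; $i++) {
        $h = ($h * $d) % $q;
    }

    $t = windowHash(substr($txt, 0, $m), $d, $q);
    $hashes = [$t];

    for ($i = 0; $i < $n - $m; $i++) {
        // Remove the leading character, shift, add the trailing one.
        $t = ($d * ($t - ord($txt[$i]) * $h) + ord($txt[$i + $m])) % $q;
        // PHP's % keeps the sign of the dividend, so fix negatives.
        if ($t < 0) {
            $t += $q;
        }
        $hashes[] = $t;
    }
    return $hashes;
}

// Check the rolling values against hashes computed directly.
$txt = "GEEKS FOR GEEKS";
$m = 4;
$agree = true;
foreach (rollingHashes($txt, $m) as $i => $hash) {
    if ($hash !== windowHash(substr($txt, $i, $m), 256, 101)) {
        $agree = false;
    }
}
echo "Rolling hashes agree with direct hashes: ";
var_export($agree);
echo "\n";
```

Two different windows can still share a hash value, which is why a hash match must be followed by a character-by-character check.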

PHP

<?php
// Following program is a PHP 
// implementation of Rabin Karp
// Algorithm given in the CLRS book 
  
/* pat -> pattern
   txt -> text
   q -> A prime number
*/
function search($pat, $txt, $q)
{
    $M = strlen($pat);
    $N = strlen($txt);
    $p = 0; // hash value 
            // for pattern
    $t = 0; // hash value 
            // for txt
    $h = 1;

    // d is the number of characters
    // in the input alphabet. It is
    // set here because a global
    // variable is not visible
    // inside a PHP function.
    $d = 256;
  
    // The value of h would
    // be "pow(d, M-1)%q"
    for ($i = 0; $i < $M - 1; $i++)
        $h = ($h * $d) % $q;
  
    // Calculate the hash value
    // of pattern and first
    // window of text. ord() turns
    // each character into its
    // integer byte value.
    for ($i = 0; $i < $M; $i++)
    {
        $p = ($d * $p + ord($pat[$i])) % $q;
        $t = ($d * $t + ord($txt[$i])) % $q;
    }
  
    // Slide the pattern over
    // text one by one
    for ($i = 0; $i <= $N - $M; $i++)
    {
  
        // Check the hash values of 
        // current window of text
        // and pattern. Only if the
        // hash values match, check
        // the characters one by one
        if ($p == $t)
        {
            // Check for characters
            // one by one
            for ($j = 0; $j < $M; $j++)
            {
                if ($txt[$i + $j] != $pat[$j])
                    break;
            }
  
            // if p == t and pat[0...M-1] = 
            // txt[i, i+1, ...i+M-1]
            if ($j == $M)
                echo "Pattern found at index ",
                                      $i, "\n";
        }
  
        // Calculate hash value for 
        // next window of text: 
        // Remove leading digit,
        // add trailing digit
        if ($i < $N - $M)
        {
            $t = ($d * ($t - ord($txt[$i]) * 
                        $h) + ord($txt[$i + 
                             $M])) % $q;
  
            // We might get negative 
            // value of t, converting
            // it to positive
            if ($t < 0)
                $t = ($t + $q);
        }
    }
    }
}
  
// Driver Code
$txt = "GEEKS FOR GEEKS";
$pat = "GEEK";
$q = 101; // A prime number
search($pat, $txt, $q);
  
// This code is contributed
// by ajit
?>


Output:

Pattern found at index 0
Pattern found at index 10

Please refer to the complete article on Rabin-Karp Algorithm for Pattern Searching for more details!


