Searching for Patterns | Set 4 (A Naive Pattern Searching Question)

Question: We have discussed Naive String matching algorithm here. Consider a situation where all characters of pattern are different. Can we modify the original Naive String Matching algorithm so that it works better for these types of patterns. If we can, then what are the changes to original algorithm?

Solution: In the original Naive String matching algorithm , we always slide the pattern by 1. When all characters of pattern are different, we can slide the pattern by more than 1. Let us see how can we do this. When a mismatch occurs after j matches, we know that the first character of pattern will not match the j matched characters because all characters of pattern are different. So we can always slide the pattern by j without missing any valid shifts. Following is the modified code that is optimized for the special patterns.

C

/* C program for A modified Naive Pattern Searching
  algorithm that is optimized for the cases when all
  characters of pattern are different */
#include<stdio.h>
#include<string.h>

/* A modified Naive Pettern Searching algorithn that is optimized
   for the cases when all characters of pattern are different */
void search(char pat[], char txt[])
{
    int M = strlen(pat);
    int N = strlen(txt);
    int i = 0;

    while (i <= N - M)
    {
        int j;

        /* For current index i, check for pattern match */
        for (j = 0; j < M; j++)
            if (txt[i+j] != pat[j])
                break;

        if (j == M)  // if pat[0...M-1] = txt[i, i+1, ...i+M-1]
        {
           printf("Pattern found at index %d \n", i);
           i = i + M;
        }
        else if (j == 0)
           i = i + 1;
        else
           i = i + j; // slide the pattern by j
    }
}

/* Driver program to test above function */
int main()
{
   char txt[] = "ABCEABCDABCEABCD";
   char pat[] = "ABCD";
   search(pat, txt);
   return 0;
}

Python

# Python program for A modified Naive Pattern Searching
# algorithm that is optimized for the cases when all
# characters of pattern are different
def search(pat, txt):
    M = len(pat)
    N = len(txt)
    i = 0

    while i <= N-M:
        # For current index i, check for pattern match
        for j in xrange(M):
            if txt[i+j] != pat[j]:
                break
            j += 1

        if j==M:    # if pat[0...M-1] = txt[i,i+1,...i+M-1]
            print "Pattern found at index " + str(i)
            i = i + M
        elif j==0:
            i = i + 1
        else:
            i = i+ j    # slide the pattern by j

# Driver program to test the above function
txt = "ABCEABCDABCEABCD"
pat = "ABCD"
search(pat, txt)

# This code is contributed by Bhavya Jain

PHP

<?php
// PHP program for A modified Naive 
// Pattern Searching algorithm that
// is optimized for the cases when all
// characters of pattern are different 

/* A modified Naive Pettern Searching
   algorithn that is optimized for the
   cases when all characters of 
   pattern are different */
function search($pat, $txt)
{
    $M = strlen($pat);
    $N = strlen($txt);
    $i = 0;

    while ($i <= $N - $M)
    {
        $j;

        /* For current index i, 
           check for pattern match */
        for ($j = 0; $j < $M; $j++)
            if ($txt[$i + $j] != $pat[$j])
                break;

        // if pat[0...M-1] = 
        // txt[i, i+1, ...i+M-1]
        if ($j == $M) 
        {
            echo("Pattern found at index $i"."\n" );
            $i = $i + $M;
        }
        else if ($j == 0)
            $i = $i + 1;
        else
        
            // slide the pattern by j
            $i = $i + $j; 
    }
}

// Driver Code
$txt = "ABCEABCDABCEABCD";
$pat = "ABCD";
search($pat, $txt);

// This code is contributed by nitin mittal.
?>


Output:

Pattern found at index 4
Pattern found at index 12

Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.






Practice Tags :
Article Tags :
Please write to us at contribute@geeksforgeeks.org to report any issue with the above content.

Recommended Posts:



2.1 Average Difficulty : 2.1/5.0
Based on 39 vote(s)