Implementation of Wu Manber Algorithm?

Last Updated : 10 Jan, 2023

What is Wu- Manber Algorithm?

The Wu-Manber algorithm is a string-matching algorithm that is used to efficiently search for patterns in a body of text. It is a hybrid algorithm that combines the strengths of the Boyer-Moore and Knuth-Morris-Pratt algorithms to provide fast and accurate pattern matching.

Illustration:

Example: s = “the quick brown fox jumps over the lazy dog” pattern = “brown”:

Step 1: Divide the pattern into two subpatterns let’s say “br” and “own“.

Step 2: Next step includes calculating hash values for each subpattern formed in step 1.

Step 3: Start iterating in s from the first character.

Step 4: If one subpattern matches the substring in s like “br” matches “brown” substring in s.

Step 5: Then will check, whether the whole pattern is matching that substring or not.

Step 6: The whole pattern is matching the substring found in string s. It will return the index of the substring indicating pattern matched.

Step 7: If let’s say it doesn’t match, it will search for another substring in s. If not found return “no match was found”.

Steps involved in Wu-Manber Algorithm:

Create a hash table that maps each possible substring of the pattern to the positions in the pattern where that substring appears.
This hash table is used to quickly identify the potential starting positions of the pattern in the text.
Iterate through the text and compare each character to the corresponding character in the pattern.
- If the characters match, you can move to the next character and continue the comparison.
- If the characters do not match, you can use the hash table to determine the maximum number of characters that can be skipped before the next potential starting position of the pattern.
This allows the algorithm to quickly skip over large sections of the text without missing any potential matches.

Below is the code to implement the above approach:

C++14

// C++ code to implement the above approach
#include <bits/stdc++.h>
using namespace std;
 
// Generate a hash for each subpattern
int HashPattern(string& pattern, int i, int j)
{
    int h = 0;
    for (int k = i; k < j; k++) {
        h = h * 256 + ((int)pattern[k] - 'a');
    }
    return h;
}
 
// Wu Manber algorithm
void WuManber(string& text, string& pattern)
{
    // Define the length of the pattern and text
    int m = pattern.length();
    int n = text.length();
 
    // Define the number of subpatterns to use
    int s = 2;
 
    // Define the length of each subpattern
    int t = m / s;
 
    // Initialize the hash values for each subpattern
    int h[s];
    for (int i = 0; i < s; i++) {
        h[i] = HashPattern(pattern, i * t, (i + 1) * t);
    }
 
    // Initialize the shift value for each subpattern
    int shift[s];
    for (int i = 0; i < s; i++) {
        shift[i] = t * (s - i - 1);
    }
 
    // Initialize the match value
    bool match = false;
 
    // Iterate through the text
    for (int i = 0; i < n - m + 1; i++) {
        // Check if the subpatterns match
        bool subpatternsMatch = true;
        int j;
 
        for (j = 0; j < s; j++) {
            if (HashPattern(text, i + j * t,
                            i + (j + 1) * t)
                != h[j]) {
                subpatternsMatch = false;
                break;
            }
        }
 
        if (subpatternsMatch) {
            // If the subpatterns match, check if the
            // full pattern matches
            if (text.substr(i, m) == pattern) {
                cout << "Match found at index " << i
                     << endl;
                match = true;
            }
        }
 
        // Shift the pattern by the appropriate amount
        bool shouldShift = true;
        for (j = 0; j < s; j++) {
            if (i + shift[j] < n - m + 1) {
                shouldShift = false;
                break;
            }
        }
 
        if (shouldShift) {
            i += shift[j];
        }
    }
 
    // If no match was found, print a message
    if (!match) {
        cout << "No match found \n";
    }
}
 
int main()
{
 
    // Code
    string text = "the cat sat on the mat";
    string pattern = "the";
 
    WuManber(text, pattern);
    return 0;
}

Java

// Java code to implement the above approach
import java.io.*;
import java.util.*;
 
class GFG {
 
  // Generate a hash for each subpattern
  static int hashPattern(String pattern, int i, int j)
  {
    int h = 0;
    for (int k = i; k < j; k++) {
      h = h * 256 + (int)pattern.charAt(k);
    }
    return h;
  }
 
  // Wu Manber algorithm
  static void wuManber(String text, String pattern)
  {
    // Define the length of the pattern and text
    int m = pattern.length();
    int n = text.length();
 
    // Define the number of subpatterns to use
    int s = 2;
 
    // Define the length of each subpattern
    int t = m / s;
 
    // Initialize the hash values for each subpattern
    int[] h = new int[s];
    for (int i = 0; i < s; i++) {
      h[i] = hashPattern(pattern, i * t, (i + 1) * t);
    }
 
    // Initialize the shift value for each subpattern
    int[] shift = new int[s];
    for (int i = 0; i < s; i++) {
      shift[i] = t * (s - i - 1);
    }
 
    // Initialize the match value
    boolean match = false;
 
    // Iterate through the text
    // Iterate through the text
    for (int i = 0; i < n - m + 1; i++) {
      // Check if the subpatterns match
      boolean subpatternsMatch = true;
      int j;
 
      for (j = 0; j < s; j++) {
        if (hashPattern(text, i + j * t,
                        i + (j + 1) * t)
            != h[j]) {
          subpatternsMatch = false;
          break;
        }
      }
 
      if (subpatternsMatch) {
        // If the subpatterns match, check if the
        // full pattern matches
        if (text.substring(i, i + m).equals(
          pattern)) {
          System.out.println(
            "Match found at index " + i);
          match = true;
        }
      }
 
      // Shift the pattern by the appropriate amount
      boolean shouldShift = true;
      for (j = 0; j < s; j++) {
        if (i + shift[j] < n - m + 1) {
          shouldShift = false;
          break;
        }
      }
 
      if (shouldShift) {
        i += shift[j];
      }
    }
 
    // If no match was found, print a message
    if (!match) {
      System.out.println("No match found");
    }
  }
 
  public static void main(String[] args)
  {
    String text = "the cat sat on the mat";
    String pattern = "the";
 
    wuManber(text, pattern);
  }
}
 
// This code is contributed by lokesh.

Python3

# Define the hash_pattern() function to generate
# a hash for each subpattern
 
 
def hashPattern(pattern, i, j):
    h = 0
    for k in range(i, j):
        h = h * 256 + ord(pattern[k])
    return h
 
# Define the Wu Manber algorithm
 
 
def wuManber(text, pattern):
 
    # Define the length of the pattern and
    # text
    m = len(pattern)
    n = len(text)
 
    # Define the number of subpatterns to use
    s = 2
 
    # Define the length of each subpattern
    t = m // s
 
    # Initialize the hash values for each
    # subpattern
    h = [0] * s
    for i in range(s):
        h[i] = hashPattern(pattern, i * t, (i + 1) * t)
 
    # Initialize the shift value for each
    # subpattern
    shift = [0] * s
    for i in range(s):
        shift[i] = t * (s - i - 1)
 
    # Initialize the match value
    match = False
 
    # Iterate through the text
    for i in range(n - m + 1):
        # Check if the subpatterns match
        for j in range(s):
            if hashPattern(text, i + j * t, i + (j + 1) * t) != h[j]:
                break
        else:
            # If the subpatterns match, check if
            # the full pattern matches
            if text[i:i + m] == pattern:
                print("Match found at index", i)
                match = True
 
        # Shift the pattern by the appropriate
        # amount
        for j in range(s):
            if i + shift[j] < n - m + 1:
                break
        else:
            i += shift[j]
 
    # If no match was found, print a message
    if not match:
        print("No match found")
 
 
# Driver Code
text = "the cat sat on the mat"
pattern = "the"
 
# Function call
wuManber(text, pattern)

C#

// C# code to implement the above approach
using System;
using System.Collections.Generic;
 
public class GFG {
 
  // Generate a hash for each subpattern
  static int HashPattern(string pattern, int i, int j)
  {
    int h = 0;
    for (int k = i; k < j; k++) {
      h = h * 256 + (int)pattern[k];
    }
    return h;
  }
 
  // Wu Manber algorithm
  static void WuManber(string text, string pattern)
  {
    // Define the length of the pattern and text
    int m = pattern.Length;
    int n = text.Length;
 
    // Define the number of subpatterns to use
    int s = 2;
 
    // Define the length of each subpattern
    int t = m / s;
 
    // Initialize the hash values for each subpattern
    int[] h = new int[s];
    for (int i = 0; i < s; i++) {
      h[i] = HashPattern(pattern, i * t, (i + 1) * t);
    }
 
    // Initialize the shift value for each subpattern
    int[] shift = new int[s];
    for (int i = 0; i < s; i++) {
      shift[i] = t * (s - i - 1);
    }
 
    // Initialize the match value
    bool match = false;
 
    // Iterate through the text
    for (int i = 0; i < n - m + 1; i++) {
      // Check if the subpatterns match
      bool subpatternsMatch = true;
      int j;
 
      for (j = 0; j < s; j++) {
        if (HashPattern(text, i + j * t,
                        i + (j + 1) * t)
            != h[j]) {
          subpatternsMatch = false;
          break;
        }
      }
 
      if (subpatternsMatch) {
        // If the subpatterns match, check if the
        // full pattern matches
        if (text.Substring(i, m).Equals(pattern)) {
          Console.WriteLine(
            "Match found at index " + i);
          match = true;
        }
      }
 
      // Shift the pattern by the appropriate amount
      bool shouldShift = true;
      for (j = 0; j < s; j++) {
        if (i + shift[j] < n - m + 1) {
          shouldShift = false;
          break;
        }
      }
 
      if (shouldShift) {
        i += shift[j];
      }
    }
 
    // If no match was found, print a message
    if (!match) {
      Console.WriteLine("No match found");
    }
  }
 
  static public void Main()
  {
 
    // Code
    string text = "the cat sat on the mat";
    string pattern = "the";
 
    WuManber(text, pattern);
  }
}
 
// This code is contributed by lokeshmvs21.

Javascript

  // JS code to implement the approach
 
  // Define the hash_pattern() function to generate
  // a hash for each subpattern
 
  function hashPattern(pattern, i, j) {
    let h = 0
    for (let k = i; k < j; k++)
      h = h * 256 + (pattern[k]).charCodeAt(0)
    return h
  }
   
  // Define the Wu Manber algorithm
  function wuManber(text, pattern) {
 
    // Define the length of the pattern and
    // text
    let m = pattern.length
    let n = text.length
 
    // Define the number of subpatterns to use
    let s = 2
 
    // Define the length of each subpattern
    let t = Math.floor(m / s)
 
    // Initialize the hash values for each
    // subpattern
    let h = new Array(s).fill(0)
    for (let i = 0; i < s; i++)
      h[i] = hashPattern(pattern, i * t, (i + 1) * t)
 
    // Initialize the shift value for each
    // subpattern
    let shift = new Array(s).fill(0)
    for (let i = 0; i < s; i++)
      shift[i] = t * (s - i - 1)
 
    // Initialize the match value
    let match = false
 
    // Iterate through the text
    for (let i = 0; i < (n - m + 1); i++) {
      // Check if the subpatterns match
      for (let j = 0; j < s; j++) {
        if (hashPattern(text, i + j * t, i + (j + 1) * t) != h[j])
          break
      }
      // If the subpatterns match, check if
      // the full pattern matches
      if (text.slice(i, i + m) == pattern) {
        console.log("Match found at index" + i + "<br>")
        match = true
      }
 
 
      // Shift the pattern by the appropriate
      // amount
      for (let j = 0; j < s; j++) {
        if (i + shift[j] < n - m + 1)
          break
        else
          i += shift[j]
      }
    }
    // If no match was found, document.write a message
    if (!match)
      console.log("No match found")
  }
 
  // Driver Code
  let text = "the cat sat on the mat"
  let pattern = "the"
 
  // Function call
  wuManber(text, pattern)
   
// This code is contributed by Potta Lokesh

Output

Match found at index 0
Match found at index 15

Time complexity: O(n + m)
Auxiliary Space: O (n+m)

Difference between KMP and Wu-Manber Algorithms?

KMP algorithm and Wu Manber algorithm are both string-matching algorithms, which means that they are used to find a substring within a larger string. Both algorithms have the same time complexity, which means that they have the same performance characteristics in terms of how long it takes for the algorithm to run.

However, there are some differences between them:

KMP algorithm uses a preprocessing step to generate a partial match table, which is used to speed up the string-matching process. This makes the KMP algorithm more efficient than the Wu Manber algorithm when the pattern that is being searched for is relatively long.
Wu Manber algorithm uses a different approach to string matching, which involves dividing the pattern into several subpatterns and using these subpatterns to search for matches in the text. This makes the Wu Manber algorithm more efficient than the KMP algorithm when the pattern that is being searched for is relatively short.