Open In App

Java Program to Implement Commentz-Walter Algorithm

Last Updated : 21 Dec, 2022
Improve
Improve
Like Article
Like
Save
Share
Report

The Commentz-Walter algorithm is a string-matching algorithm that is used to search for a given pattern in a text string. 

  • It is a variant of the Knuth-Morris-Pratt (KMP) algorithm, which is a well-known algorithm for string matching. 
  • This algorithm works by preprocessing the pattern string to create a failure function that encodes the information about the positions of the possible suffixes in the pattern. 
  • This failure function is then used to efficiently skip over parts of the text string that cannot match the pattern, allowing the algorithm to quickly find the pattern in the text string. 
  • The Commentz-Walter algorithm is known for its simplicity and efficiency. 

Examples:

Input: str = “Hello, World! How are you?”,  pattern = “World”
Output: Pattern found at index 7

Input: str = “Hello, World! How are you?”,  pattern = “you”
Output: Pattern found at index 22

Applications of Commentz-Walter Algorithm: 

The Commentz-Walter algorithm is a string-matching algorithm that can be used to search for a pattern in a text string. Some possible applications of this algorithm include:

  • Searching for a specific word or phrase in a document or text file
  • Searching for a specific string in a large database of strings
  • Searching for a specific pattern in a sequence of DNA or protein
  • Searching for a specific pattern in a network traffic stream to detect anomalies or malicious activity
  • Searching for a specific pattern in a signal to detect specific events or patterns

These are just a few examples of how the Commentz-Walter algorithm can be used. It can be applied to any situation where it is necessary to search for a pattern the in a text string.

Follow the below steps to solve the algorithm:

The Commentz-Walter is a string-matching algorithm that uses a failure function to skip over parts of the text string that cannot match the pattern string. This can be a faster and more efficient way to search for a pattern in a text string compared to other string-matching algorithms.

Here is an example of how the Commentz-Walter algorithm works:

Let’s say we have the following text string: Hello, World! How are you? and we want to search for the pattern string World in the text string

  • First, Preprocess the pattern string to create the failure function. This involves initializing the pointers for the pattern string and looping through the pattern to create the failure function. In this case, the failure function will be [0, 0, 0, 0, 0].
  • Next, Initialize the pointers for the text and pattern strings and start looping through the text string. The first character of the text string ‘H’ does not match the first character of the pattern string ‘W’, so we use the failure function to skip over part of the text string that cannot match the pattern string. The failure function tells us to move the pointer for the text string to the next character, so we move from ‘H’ to ‘e’
  • The next character of the text string e also does not match the first character of the pattern string  ‘W’, so we again use the failure function to move the pointer for the text string to the next character. This continues until we reach character  ‘W’ in the text string, which matches the first character of the pattern string.
  • At this point, the pointers for both the text and pattern strings are incremented and we move to the next characters in both strings. The next character of the text string ‘o’ matches the next character of the pattern string ‘o’, so the pointers are incremented again. This continues until we reach the end of the pattern string, at which point we know that the pattern string has been found in the text string.
  • Finally, we return the starting index of the match, which is 6 in this case. The search is complete and we have successfully found the pattern string World in the text string Hello, World! How are you?

Below is the Implementation of the above approach:

Java




import java.io.*;
public class CommentzWalter {
    // Function to find the pattern in the text string using
    // the Commentz-Walter algorithm
    public static int search(String text, String pattern)
    {
        // Preprocess the pattern string to create the
        // failure function
        int[] failure = createFailureFunction(pattern);
  
        // Initialize the pointers for the text and pattern
        // strings
        int i = 0;
        int j = 0;
  
        // Loop through the text string
        while (i < text.length()) {
  
            // Check if the current character of the text
            // string matches the current character of the
            // pattern string
            if (text.charAt(i) == pattern.charAt(j)) {
  
                // If the characters match, increment the
                // pointers for both strings
                i++;
                j++;
            }
  
            // If the pattern string has been matched,
            // return the starting index of the match
            if (j == pattern.length()) {
                return i - j;
            }
  
            // If the current character of the text string
            // does not match the current character of the
            // pattern string, use the failure function to
            // skip over part of the text string that cannot
            // match the pattern
            if (i < text.length()
                && text.charAt(i) != pattern.charAt(j)) {
                if (j > 0) {
                    j = failure[j - 1];
                }
                else {
                    i++;
                }
            }
        }
  
        // If the pattern string was not found
        // in the text string, return -1
        return -1;
    }
  
    // Function to preprocess the pattern
    // string and create the failure function
    public static int[] createFailureFunction(
        String pattern)
    {
        int[] failure = new int[pattern.length()];
  
        // Initialize the pointers for the
        // pattern string
        int i = 0;
        int j = 1;
  
        // Loop through the pattern string
        while (j < pattern.length()) {
  
            // If the current character of the pattern
            // string matches the previous character,
            // increment the pointers for both strings
            if (pattern.charAt(i) == pattern.charAt(j)) {
                failure[j] = i + 1;
                i++;
                j++;
            }
            else {
  
                // If the characters do not match, use the
                // failure function to skip over part of the
                // pattern string that cannot match the text
                // string
                if (i > 0) {
                    i = failure[i - 1];
                }
                else {
                    failure[j] = 0;
                    j++;
                }
            }
        }
  
        return failure;
    }
  
    public static void main(String[] args)
    {
  
        // Text string
        String text = "Hello, World! How are you?";
  
        // Pattern string
        String pattern = "World";
  
        // Search for the pattern in the text
        // string using the Commentz-Walter algorithm
        int index = search(text, pattern);
  
        // Print the result
        if (index >= 0) {
            System.out.println("Pattern found at index "
                               + index);
        }
        else {
            System.out.println(
                "Pattern not found in the text string");
        }
    }
}


Output

Pattern found at index 7

Time complexity: O(M+N), where M is the length of the pattern and N is the length of the text string. 
Auxiliary Space: O(N), where N is the length of the pattern string

Related Articles:



Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads