Remove all occurrences of a string t in string s using Boyer-Moore Algorithm

Last Updated : 31 Jul, 2023

Given a string s and string t, the task is to remove all occurrences of a string t in a string s using the Boyer-Moore algorithm.

Examples:

Input: s = “ababaababa”, t = “aba”
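Output: baba
Explanation: Scanning left to right, the non-overlapping matches of “aba” start at indices 0 and 5. Removing them leaves “ba” + “ba”.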

Input: s = “Geeksforgeeks”, t = “eek”
Output: Gsforgs
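As a cross-check for expected outputs, Python's built-in `str.replace` performs the same kind of left-to-right, non-overlapping removal. The helper name `remove_all` below is hypothetical and is not part of the approach itself. It is only a reference to compare against.

```python
def remove_all(s, t):
    """Reference behaviour: drop every non-overlapping match of t, scanning left to right."""
    return s.replace(t, "")


print(remove_all("Geeksforgeeks", "eek"))  # Gsforgs
```

Any Boyer-Moore based implementation of the task can be compared against this helper on small inputs.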

Approach: This can be solved with the following idea:

• We precompute the bad character table for t and then scan s with the Boyer-Moore algorithm to find all occurrences of t. When we find a match, we remove the matched substring from s.
• In the preBadChar() function we create an array called badChar[] that stores the index of the last occurrence of each character in t (or -1 if the character does not occur in t). This table implements the “bad character rule”: after a mismatch, it tells the algorithm how far the window can safely move, so some positions in s are skipped without being compared. The removeOccurrences() function performs the search and the removal.
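To make the table concrete, here is a minimal sketch of how it could be built, assuming every character has a code below 256. The names `build_bad_char` and `alphabet_size` are illustrative and not taken from the listings.

```python
def build_bad_char(t, alphabet_size=256):
    """Return a list where entry c is the last index of chr(c) in t, or -1."""
    bad_char = [-1] * alphabet_size
    for idx, ch in enumerate(t):
        # Later occurrences overwrite earlier ones, so the last index wins.
        bad_char[ord(ch)] = idx
    return bad_char


table = build_bad_char("eek")
print(table[ord("e")], table[ord("k")], table[ord("x")])  # 1 2 -1
```

For t = "eek", the character 'e' occurs at indices 0 and 1, so its entry is 1. A character such as 'x' that never occurs in t keeps the initial value -1.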

Below are the steps involved in the implementation of the code:

• We first calculate the lengths of the input strings s and t and initialize the badChar array of size 256 with all elements set to -1. The badChar array will be used to store the index of the last occurrence of each character in the pattern t.
• We then call the preBadChar() function which populates the badChar array with the index of the last occurrence of each character in the pattern t.
• We initialize the variable i to 0, which will be used to keep track of the index of the start of the window of s that we will compare to the pattern t.
• We start a while loop that runs as long as i is less than or equal to n – m, where n is the length of s and m is the length of t. This ensures that the window never extends past the end of s.
• Inside the while loop, we initialize the variable j to m – 1, the index of the last character of t. We compare t with the window from right to left, so the first comparison is between t[j] and s[i + j].
• We then start an inner while loop that decrements j while t[j] equals s[i + j]. It stops either when every character of t has matched (j becomes -1) or at the first mismatch.
• If j is less than 0, we have found an occurrence of t at index i. We remove it from s using the erase() function and update n to the new length of s. We leave i unchanged: the characters after the match have shifted left, so the next unexamined character is now at index i.
• Otherwise, s[i + j] is the bad character. We look up badChar[s[i + j]], the index of its last occurrence in t, and move the window to the right by the maximum of 1 and j minus that index. If the bad character does not occur in t, the lookup gives -1 and the shift is j + 1, which moves the window past the bad character. This lets us skip comparisons that cannot produce a match.
• We repeat steps 5-8 until the window no longer fits in the remaining part of s.
• Finally, s holds the original string with all occurrences of t removed.
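The steps above can be followed on a small input with a tracing sketch. It assumes two things that should be read as choices of this sketch: the window index i stays in place after a deletion (the remaining text slides left into position i), and the table is a dictionary rather than a 256-entry array. The names `trace_removal` and `log` are hypothetical.

```python
def trace_removal(s, t):
    """Return (result, log); log holds one (i, action) entry per window."""
    m = len(t)
    if m == 0:
        # Nothing to remove; this also avoids an endless loop.
        return s, []
    # Last index of each character in t (missing characters count as -1).
    bad_char = {ch: idx for idx, ch in enumerate(t)}
    log = []
    i = 0
    while i <= len(s) - m:
        j = m - 1
        # Compare the pattern with the window from right to left.
        while j >= 0 and t[j] == s[i + j]:
            j -= 1
        if j < 0:
            # Full match: delete it and keep i where it is.
            s = s[:i] + s[i + m:]
            log.append((i, "delete"))
        else:
            shift = max(1, j - bad_char.get(s[i + j], -1))
            log.append((i, f"shift {shift}"))
            i += shift
    return s, log


result, log = trace_removal("Geeksforgeeks", "eek")
print(result)  # Gsforgs
for entry in log:
    print(entry)
```

For this input the log contains two deletions, at indices 1 and 6, and one shift of 3 at the window "sfo", where the mismatched character 'o' does not occur in t.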

Below is the implementation of the above approach:

C++

```cpp
// C++ Implementation
#include <algorithm>
#include <iostream>
#include <string>
#define NO_OF_CHARS 256
using namespace std;

// Fill badChar[] with the index of the last occurrence
// of each character of t (-1 if the character is absent)
void preBadChar(string t, int m, int badChar[])
{
    int i;

    for (i = 0; i < NO_OF_CHARS; i++) {
        badChar[i] = -1;
    }

    for (i = 0; i < m; i++) {
        // Cast to unsigned char so the index is never negative
        badChar[(unsigned char)t[i]] = i;
    }
}

// Function to remove occurrence of string
void removeOccurrences(string& s, string t)
{

    int m = t.length();
    int n = s.length();
    int badChar[NO_OF_CHARS];

    // An empty pattern has nothing to remove
    if (m == 0)
        return;

    preBadChar(t, m, badChar);

    int i = 0;
    while (i <= n - m) {
        int j = m - 1;

        // Compare the pattern with the window from right to left
        while (j >= 0 && t[j] == s[i + j]) {
            j--;
        }

        if (j < 0) {
            // Match: erase it. The rest of s shifts left,
            // so the next window starts at the same index i.
            s.erase(i, m);
            n = s.length();
        }
        else {
            i += max(1, j - badChar[(unsigned char)s[i + j]]);
        }
    }
}

// Driver code
int main()
{
    string s = "Geeksforgeeks";
    string t = "eek";

    // Function call
    removeOccurrences(s, t);
    cout << s << endl;
    return 0;
}
```

Java

```java
import java.util.Arrays;

public class Main {
    static final int NO_OF_CHARS = 256;

    // Function to pre-process the bad character array
    // (assumes every character has a code below NO_OF_CHARS)
    static void preBadChar(String t, int m, int[] badChar) {
        // Initialize all occurrences as -1
        Arrays.fill(badChar, -1);

        // Fill the actual value of last occurrence of a character
        for (int i = 0; i < m; i++) {
            badChar[t.charAt(i)] = i;
        }
    }

    // Function to remove occurrences of the given pattern from the given string
    static void removeOccurrences(StringBuilder s, String t) {
        int m = t.length();
        int n = s.length();
        int[] badChar = new int[NO_OF_CHARS];

        // An empty pattern has nothing to remove
        if (m == 0) {
            return;
        }

        preBadChar(t, m, badChar);

        int i = 0;
        while (i <= n - m) {
            int j = m - 1;

            // Keep reducing the index j of pattern while characters of pattern
            // and string are matching at this shift i
            while (j >= 0 && t.charAt(j) == s.charAt(i + j)) {
                j--;
            }

            // If the pattern is present at current shift, then remove it.
            // The rest of s shifts left, so i stays where it is.
            if (j < 0) {
                s.delete(i, i + m);
                n = s.length();
            }
            else {
                // Shift the pattern so that the bad character in text aligns
                // with the last occurrence of it in pattern.
                i += Math.max(1, j - badChar[s.charAt(i + j)]);
            }
        }
    }

    // Driver code
    public static void main(String[] args) {
        StringBuilder s = new StringBuilder("Geeksforgeeks");
        String t = "eek";

        // Function call
        removeOccurrences(s, t);
        System.out.println(s);
    }
}
```

Python

```python
# Function to precompute bad character table
# (assumes every character has a code point below 256)
def preBadChar(t, m):
    # Initialize an array to store bad character position
    badChar = [-1] * 256

    # Fill the array with the last occurrence of each character in t
    for i in range(m):
        badChar[ord(t[i])] = i

    return badChar


# Function to remove all occurrences of t in s
def removeOccurrences(s, t):
    m = len(t)
    n = len(s)

    # An empty pattern has nothing to remove
    if m == 0:
        return s

    # Precompute the bad character table
    badChar = preBadChar(t, m)

    i = 0
    while i <= n - m:
        j = m - 1

        # Compare characters from right to left
        while j >= 0 and t[j] == s[i + j]:
            j -= 1

        if j < 0:
            # If pattern is found, remove it from s.
            # The rest of s shifts left, so i stays where it is.
            s = s[:i] + s[i + m:]
            n = len(s)
        else:
            # Shift the pattern so that bad character aligns
            i += max(1, j - badChar[ord(s[i + j])])

    return s


# Driver code
s = "Geeksforgeeks"
t = "eek"

# Function call
s = removeOccurrences(s, t)
print(s)
```

C#

```csharp
// C# code to implement the above approach.
using System;

public class GFG
{
    const int NO_OF_CHARS = 256;

    // Function to pre-process the bad character array
    // (assumes every character has a code below NO_OF_CHARS)
    static void PreBadChar(string t, int m, int[] badChar)
    {
        // Initialize all occurrences as -1
        Array.Fill(badChar, -1);

        // Fill the actual value of last occurrence of a character
        for (int i = 0; i < m; i++)
        {
            badChar[(int)t[i]] = i;
        }
    }

    // Function to remove occurrences of the given pattern
    // from the given string
    static void RemoveOccurrences(ref string s, string t)
    {
        int m = t.Length;
        int n = s.Length;
        int[] badChar = new int[NO_OF_CHARS];

        // An empty pattern has nothing to remove
        if (m == 0)
        {
            return;
        }

        PreBadChar(t, m, badChar);

        int i = 0;
        while (i <= n - m)
        {
            int j = m - 1;

            // Keep reducing the index j of pattern while
            // characters of pattern and string are matching
            // at this shift i
            while (j >= 0 && t[j] == s[i + j])
            {
                j--;
            }

            // If the pattern is present at current shift,
            // then remove it. The rest of s shifts left,
            // so i stays where it is.
            if (j < 0)
            {
                s = s.Remove(i, m);
                n = s.Length;
            }
            else
            {
                // Shift the pattern so that the bad character
                // in text aligns with the last occurrence of it
                // in pattern.
                i += Math.Max(1, j - badChar[(int)s[i + j]]);
            }
        }
    }

    // Driver code
    public static void Main(string[] args)
    {
        string s = "Geeksforgeeks";
        string t = "eek";

        // Function call
        RemoveOccurrences(ref s, t);
        Console.WriteLine(s);
    }
}

// This code is contributed by Vaibhav Nandan
```

Javascript

```javascript
// Function to pre-process the bad character array
// (assumes every character has a code below 256)
function preBadChar(t, m) {

    // Initialize all occurrences as -1
    var badChar = new Array(256).fill(-1);

    // Fill the actual value of last occurrence of a character
    for (var i = 0; i < m; i++) {
        badChar[t.charCodeAt(i)] = i;
    }
    return badChar;
}

// Function to remove occurrences of the given pattern from the given string
function removeOccurrences(s, t) {
    var m = t.length;
    var n = s.length;

    // An empty pattern has nothing to remove
    if (m === 0) {
        return s;
    }

    var badChar = preBadChar(t, m);
    var i = 0;
    while (i <= n - m) {
        var j = m - 1;

        // Keep reducing the index j of pattern while characters of pattern
        // and string are matching at this shift i
        while (j >= 0 && t[j] == s[i + j]) {
            j -= 1;
        }

        // If the pattern is present at current shift, then remove it.
        // The rest of s shifts left, so i stays where it is.
        if (j < 0) {
            s = s.slice(0, i) + s.slice(i + m);
            n = s.length;
        }
        // Shift the pattern so that the bad character in text aligns with
        // the last occurrence of it in pattern.
        else {
            i += Math.max(1, j - badChar[s.charCodeAt(i + j)]);
        }
    }
    return s;
}

// Test case
var s = "Geeksforgeeks";
var t = "eek";

s = removeOccurrences(s, t);
console.log(s);
```

PHP

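```php
<?php
define('NO_OF_CHARS', 256);

// Function to pre-process the bad character array
function preBadChar($t, $m, &$badChar)
{
    // Initialize all occurrences as -1
    for ($i = 0; $i < NO_OF_CHARS; $i++) {
        $badChar[$i] = -1;
    }

    // Fill the actual value of last occurrence of a character
    for ($i = 0; $i < $m; $i++) {
        $badChar[ord($t[$i])] = $i;
    }
}

// Function to remove occurrences of the given pattern from the given string
function removeOccurrences(&$s, $t)
{
    $m = strlen($t);
    $n = strlen($s);

    // An empty pattern has nothing to remove
    if ($m == 0) {
        return;
    }

    $badChar = array();
    preBadChar($t, $m, $badChar);

    $i = 0;
    while ($i <= $n - $m) {
        $j = $m - 1;

        // Compare the pattern with the window from right to left
        while ($j >= 0 && $t[$j] == $s[$i + $j]) {
            $j--;
        }

        if ($j < 0) {
            // If pattern occurs, remove it.
            // The rest of $s shifts left, so $i stays where it is.
            $s = substr_replace($s, '', $i, $m);
            $n = strlen($s);
        }
        else {
            // Shift the pattern so that the bad character in text aligns with the last occurrence of it in pattern
            $i += max(1, $j - $badChar[ord($s[$i + $j])]);
        }
    }
}

// Driver code
$s = "Geeksforgeeks";
$t = "eek";

// Function call
removeOccurrences($s, $t);
echo $s . "\n";

?>
```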

Output

```
Gsforgs
```

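Time Complexity: O(mn) in the worst case for the character comparisons, where n is the length of s and m is the length of t. Each deletion additionally costs O(n), because the characters after the match are moved or copied.
Auxiliary Space: O(k), where k is the alphabet size (256 here), for the badChar array.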
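The worst-case bound for the comparisons can be observed with a small counting sketch. It is an illustration under stated assumptions rather than part of the listings: `count_comparisons` is a hypothetical helper, it only searches (deletion is left out), and after a full match it simply advances by one position.

```python
def count_comparisons(s, t):
    """Count character comparisons made while scanning s for t (no deletion)."""
    m, n = len(t), len(s)
    if m == 0:
        return 0
    # Last index of each character in t (missing characters count as -1).
    bad_char = {ch: idx for idx, ch in enumerate(t)}
    comparisons = 0
    i = 0
    while i <= n - m:
        j = m - 1
        while j >= 0:
            comparisons += 1
            if t[j] != s[i + j]:
                break
            j -= 1
        if j < 0:
            i += 1  # full match: move on by one; deletion is not modelled
        else:
            i += max(1, j - bad_char.get(s[i + j], -1))
    return comparisons


print(count_comparisons("a" * 8, "baa"))  # 18
print(count_comparisons("x" * 9, "abc"))  # 3
```

With s = "aaaaaaaa" and t = "baa", every one of the 6 windows is compared in full before the mismatch at the first pattern character, and the shift is only 1 each time, giving 3 × 6 = 18 comparisons. With s = "xxxxxxxxx" and t = "abc", the bad character never occurs in t, so each window costs one comparison and the window jumps by 3.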