Given a string and several queries, each specifying a substring of the input string, check whether each substring is a palindrome.
Suppose our input string is “abaaabaaaba” and the queries are [0, 10], [5, 8], [2, 5], [5, 9].
We have to tell whether the substring with the given starting and ending indices (both inclusive) is a palindrome.
[0, 10] → Substring is “abaaabaaaba” which is a palindrome.
[5, 8] → Substring is “baaa” which is not a palindrome.
[2, 5] → Substring is “aaab” which is not a palindrome.
[5, 9] → Substring is “baaab” which is a palindrome.
Let us assume that there are Q such queries to be answered and that N is the length of our input string. There are two ways to answer these queries:
The first, naive method is to go through the queries one by one and check directly whether the substring under consideration is a palindrome.
Since there are Q queries and each query can take O(N) time in the worst case, this method takes O(Q.N) time in the worst case. It works in place and needs no extra space, but there is a more efficient method.
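The naive check can be sketched as follows. This is a minimal illustration rather than the article's original program; the function name is_palindrome_naive is an assumption made for this example.

```python
def is_palindrome_naive(s, left, right):
    """Check s[left..right] (both inclusive) with two pointers moving inward."""
    while left < right:
        if s[left] != s[right]:
            return False
        left += 1
        right -= 1
    return True


text = "abaaabaaaba"
queries = [(0, 10), (5, 8), (2, 5), (5, 9)]
results = [is_palindrome_naive(text, l, r) for l, r in queries]
print(results)  # [True, False, False, True]
```

Each call inspects at most half of the substring, which is where the O(N) cost per query comes from.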
The second, more efficient method is similar in idea to Rabin-Karp string matching: we use string hashing. We calculate cumulative hash values of the original string and of the reversed string, and store them in two arrays, prefix and suffix.
How to calculate the cumulative hash values?
Suppose our string is str. The cumulative hash function used to fill the prefix array is:
prefix[0] = 0
prefix[i] = str[0] + str[1] * 101 + str[2] * 101^2 + ... + str[i-1] * 101^(i-1)

For example, take the string “abaaabxyaba”:
prefix[0] = 0
prefix[1] = 97 (ASCII value of ‘a’ is 97)
prefix[2] = 97 + 98 * 101
prefix[3] = 97 + 98 * 101 + 97 * 101^2
...
prefix[11] = 97 + 98 * 101 + 97 * 101^2 + ... + 97 * 101^10
The reason for storing the values this way is that we can find the hash value of any substring in O(1) time using
hash(L, R) = prefix[R+1] - prefix[L]
For example, hash(1, 5) = hash(“baaab”) = prefix[6] - prefix[1] = 98 * 101 + 97 * 101^2 + 97 * 101^3 + 97 * 101^4 + 98 * 101^5 = 1040184646587 [We will use this weird value later to explain what’s happening].
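The prefix array and the O(1) substring hash can be reproduced with a short sketch. It relies on Python's arbitrary-precision integers, so no modulo is needed here; the names build_prefix and substring_hash are assumptions made for this example, not functions from the article's program.

```python
BASE = 101  # the multiplier used throughout the article


def build_prefix(s):
    """prefix[i] = s[0] + s[1]*101 + ... + s[i-1]*101^(i-1), with prefix[0] = 0."""
    prefix = [0] * (len(s) + 1)
    power = 1  # holds 101^i for the current position i
    for i, ch in enumerate(s):
        prefix[i + 1] = prefix[i] + ord(ch) * power
        power *= BASE
    return prefix


def substring_hash(prefix, left, right):
    """hash(L, R) = prefix[R+1] - prefix[L]; still scaled by 101^L."""
    return prefix[right + 1] - prefix[left]


prefix = build_prefix("abaaabxyaba")
print(prefix[1], prefix[2])          # 97 9995
print(substring_hash(prefix, 1, 5))  # 1040184646587
```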
Similarly, we fill our suffix array as:

suffix[0] = 0
suffix[i] = str[n-1] + str[n-2] * 101 + str[n-3] * 101^2 + ... + str[n-i] * 101^(i-1)

For example, take the string “abaaabxyaba”:
suffix[0] = 0
suffix[1] = 97 (ASCII value of ‘a’ is 97)
suffix[2] = 97 + 98 * 101
suffix[3] = 97 + 98 * 101 + 97 * 101^2
...
suffix[11] = 97 + 98 * 101 + 97 * 101^2 + ... + 97 * 101^10
The reason for storing the values this way is that we can find the reverse hash value of any substring in O(1) time using
reverse_hash(L, R) = hash(R, L) = suffix[n-L] - suffix[n-R-1]
where n = length of string.
For “abaaabxyaba”, n = 11
reverse_hash(1, 5) = reverse_hash(“baaab”) = hash(“baaab”) [Reversing “baaab” gives “baaab”]
hash(“baaab”) = suffix[11-1] - suffix[11-5-1] = suffix[10] - suffix[5] = 98 * 101^5 + 97 * 101^6 + 97 * 101^7 + 97 * 101^8 + 98 * 101^9 = 108242031437886501387
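The suffix array and the reverse hash can be checked the same way. Again this is only a sketch using exact Python integers; build_suffix and reverse_hash are names assumed for this example.

```python
BASE = 101


def build_suffix(s):
    """suffix[i] = s[n-1] + s[n-2]*101 + ... + s[n-i]*101^(i-1), with suffix[0] = 0."""
    n = len(s)
    suffix = [0] * (n + 1)
    power = 1  # holds 101^(i-1) while filling suffix[i]
    for i in range(1, n + 1):
        suffix[i] = suffix[i - 1] + ord(s[n - i]) * power
        power *= BASE
    return suffix


def reverse_hash(suffix, n, left, right):
    """reverse_hash(L, R) = suffix[n-L] - suffix[n-R-1]; scaled by 101^(n-R-1)."""
    return suffix[n - left] - suffix[n - right - 1]


text = "abaaabxyaba"
suffix = build_suffix(text)
print(reverse_hash(suffix, len(text), 1, 5))  # 108242031437886501387
```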
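At first sight there doesn’t seem to be any relation between these two weird integers, 1040184646587 and 108242031437886501387.
Think again. Is there any relation between these two massive integers?
Yes, there is, and this observation is the core of this article:
1040184646587 * 101^4 = 108242031437886501387
The factor 101^4 appears because the substring's powers of 101 start at 101^L = 101^1 in the prefix array but at 101^(n-R-1) = 101^5 in the suffix array.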
It follows that any substring starting at index L and ending at index R (both inclusive) is a palindrome if
(prefix[R + 1] - prefix[L]) / 101^L = (suffix[n - L] - suffix[n - R - 1]) / 101^(n - R - 1)
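The condition can be tried out directly with exact integers. The sketch below cross-multiplies instead of dividing, which is an equivalent rearrangement chosen here to stay in integer arithmetic; build_tables and is_palindrome_exact are names assumed for this example.

```python
BASE = 101


def build_tables(s):
    """Return (prefix, suffix, powers) using exact, unreduced integers."""
    n = len(s)
    prefix = [0] * (n + 1)
    suffix = [0] * (n + 1)
    powers = [1] * (n + 1)  # powers[i] = 101^i
    for i in range(n):
        prefix[i + 1] = prefix[i] + ord(s[i]) * powers[i]
        suffix[i + 1] = suffix[i] + ord(s[n - 1 - i]) * powers[i]
        powers[i + 1] = powers[i] * BASE
    return prefix, suffix, powers


def is_palindrome_exact(s, tables, left, right):
    """Compare forward/101^L with backward/101^(n-R-1) by cross-multiplying."""
    prefix, suffix, powers = tables
    n = len(s)
    forward = prefix[right + 1] - prefix[left]
    backward = suffix[n - left] - suffix[n - right - 1]
    return forward * powers[n - right - 1] == backward * powers[left]


text = "abaaabaaaba"
tables = build_tables(text)
queries = [(0, 10), (5, 8), (2, 5), (5, 9)]
results = [is_palindrome_exact(text, tables, l, r) for l, r in queries]
print(results)  # [True, False, False, True]
```

The numbers involved grow with the length of the string, which is why a practical implementation reduces everything modulo a prime.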
The rest is just implementation.
The function computerPowers() in the program computes the powers of 101 using dynamic programming.
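A possible shape for such a power table is sketched below. The original function body is not shown in this text, so the name compute_powers and its signature are assumptions; the values are reduced modulo 1000000007, the modulus the article adopts for all operations.

```python
MOD = 1_000_000_007
BASE = 101


def compute_powers(n):
    """Return powers where powers[i] = 101^i mod MOD, each built from the previous one."""
    powers = [1] * (n + 1)
    for i in range(1, n + 1):
        powers[i] = (powers[i - 1] * BASE) % MOD
    return powers


powers = compute_powers(11)
print(powers[:4])  # [1, 101, 10201, 1030301]
```

Each entry reuses the previous entry, so the whole table costs O(n) multiplications.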
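As we can see, the hash values and the reverse hash values become huge even for strings as short as 8 characters. The fixed-width integer types of C and C++ cannot hold such large numbers, so the computation would overflow. To avoid this, we take every result modulo a prime (a prime modulus guarantees that every non-zero value has a modular inverse, which the division step needs). A common choice is 1000000007, a large prime that fits in a 32-bit signed integer. Hence all the operations are done modulo 1000000007. Once values are reduced this way, two different substrings can occasionally produce the same hash, so equal hashes indicate a palindrome only with high probability.
Python integers have arbitrary precision, and Java provides BigInteger, so in those languages the method can be implemented without the modulo operator, at the cost of slower arithmetic on very large numbers.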
The fundamental modulo operations which are used extensively in the program are listed below.
1) Addition-
(a + b) % M = (a % M + b % M) % M
(a + b + c) % M = (a % M + b % M + c % M) % M
(a + b + c + d) % M = (a % M + b % M + c % M + d % M) % M
... and so on for more terms.
2) Multiplication-
(a * b) % M = ((a % M) * (b % M)) % M
(a * b * c) % M = (((a * b) % M) * c) % M
(a * b * c * d) % M = ((((a * b) % M * c) % M) * d) % M
... and so on for more factors.
This property is used by the modPow() function, which computes the power of a number modulo M.
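One common way to write such a function is exponentiation by repeated squaring, sketched below. Whether the article's modPow() uses this exact scheme is an assumption; the name mod_pow and its parameters are chosen for this example.

```python
def mod_pow(base, exponent, modulus):
    """Compute (base ** exponent) % modulus for a non-negative integer exponent."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current lowest bit is set
            result = (result * base) % modulus
        base = (base * base) % modulus        # square for the next bit
        exponent >>= 1
    return result


print(mod_pow(2, 10, 1000))  # 24
```

Every intermediate product is reduced modulo M, so no value ever exceeds (M - 1)^2. In Python the built-in pow(base, exponent, modulus) gives the same result.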
3) Mixture of addition and multiplication-
(a * x + b * y + c) % M = ((a * x) % M + (b * y) % M + c % M) % M
4) Subtraction-
(a - b) % M = (a % M - b % M + M) % M [Correct]
(a - b) % M = (a % M - b % M) % M [Wrong: in C and C++ the result can be negative when a % M is smaller than b % M]
5) Division-
(a / b) % M = (a * MMI(b)) % M
where MMI() is a function to calculate the Modulo Multiplicative Inverse. This requires that b is not a multiple of M. In our program this is implemented by the function findMMI().
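Putting the modular pieces together, a query can be answered roughly as sketched below. This is not the article's original program: the names find_mmi, build_tables and is_palindrome_hashed are assumptions, and the inverse is computed with Fermat's little theorem (b^(M-2) mod M for prime M), which is one standard way to obtain it.

```python
MOD = 1_000_000_007
BASE = 101


def find_mmi(value):
    """Modular inverse via Fermat's little theorem; needs MOD prime and value % MOD != 0."""
    return pow(value, MOD - 2, MOD)


def build_tables(s):
    """Return (prefix, suffix, powers), all reduced modulo MOD."""
    n = len(s)
    prefix = [0] * (n + 1)
    suffix = [0] * (n + 1)
    powers = [1] * (n + 1)
    for i in range(n):
        prefix[i + 1] = (prefix[i] + ord(s[i]) * powers[i]) % MOD
        suffix[i + 1] = (suffix[i] + ord(s[n - 1 - i]) * powers[i]) % MOD
        powers[i + 1] = (powers[i] * BASE) % MOD
    return prefix, suffix, powers


def is_palindrome_hashed(s, tables, left, right):
    """Compare the normalised forward and reverse hashes of s[left..right]."""
    prefix, suffix, powers = tables
    n = len(s)
    # "+ MOD" mirrors the safe subtraction rule; Python's % is already non-negative.
    forward = (prefix[right + 1] - prefix[left] + MOD) % MOD
    backward = (suffix[n - left] - suffix[n - right - 1] + MOD) % MOD
    # Division by a power of 101 becomes multiplication by its inverse.
    forward = (forward * find_mmi(powers[left])) % MOD
    backward = (backward * find_mmi(powers[n - right - 1])) % MOD
    return forward == backward


text = "abaaabaaaba"
tables = build_tables(text)
queries = [(0, 10), (5, 8), (2, 5), (5, 9)]
results = [is_palindrome_hashed(text, tables, l, r) for l, r in queries]
for (l, r), ok in zip(queries, results):
    print(f"The Substring [{l} {r}] is {'' if ok else 'not '}a palindrome")
```

Because the hashes are reduced modulo a prime, a match is only a strong indication; a direct character comparison can be added after a match if certainty is required.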
Output:
The Substring [0 10] is a palindrome
The Substring [5 8] is not a palindrome
The Substring [2 5] is not a palindrome
The Substring [5 9] is a palindrome
This article is contributed by Rachit Belwariar.