Given a string, count the number of distinct substrings using the Rabin-Karp algorithm.
Examples:
Input : str = "aba"
Output : 5
Explanation : The 5 distinct substrings are "a", "ab", "aba", "b", "ba"

Input : str = "abcd"
Output : 10
Explanation : The 10 distinct substrings are "a", "ab", "abc", "abcd", "b", "bc", "bcd", "c", "cd", "d"
Approach:
Prerequisite: Rabin-Karp Algorithm for Pattern Searching
For every starting position, extend the substring one character at a time, update its hash value with the new character, and store the hash in a dictionary/map so that a substring seen before is not counted again.
The hash (rolling hash) is computed as in the Rabin-Karp algorithm:
The hash function suggested by Rabin and Karp calculates an integer value for a string by treating the string as a number. For example, if the alphabet were the ten decimal digits, the numeric value of "122" would be 122. In practice the number of possible characters is higher than 10 (256 in general) and the pattern can be long, so the numeric value cannot be stored directly in an integer. The value is therefore computed using modular arithmetic, which ensures that hash values fit in an integer variable (a machine word).
To rehash, we remove the most significant digit and add the new least significant digit to the hash value. Rehashing is done using the following formula.
hash( txt[s+1 .. s+m] ) = ( d * ( hash( txt[s .. s+m-1] ) - txt[s] * h ) + txt[s+m] ) mod q
hash( txt[s .. s+m-1] ) : Hash value at shift s
hash( txt[s+1 .. s+m] ) : Hash value at the next shift (shift s+1)
d: Number of characters in the alphabet
q: A prime number
h: d^(m-1)
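The rehashing formula can be sketched in Python as follows. This is a minimal illustration, not part of the program in this article: the function name rolling_hashes and the default modulus 1_000_000_007 are assumptions chosen for the example, and h is reduced mod q so that intermediate values stay small.

```python
def rolling_hashes(txt, m, d=256, q=1_000_000_007):
    """Return the hashes of all length-m windows of txt, left to right.

    d and q play the roles described above; the defaults are illustrative.
    """
    n = len(txt)
    if m <= 0 or m > n:
        return []

    # h = d^(m-1) mod q, the weight of the most significant character
    h = pow(d, m - 1, q)

    # Hash of the first window txt[0 .. m-1]
    cur = 0
    for ch in txt[:m]:
        cur = (cur * d + ord(ch)) % q
    hashes = [cur]

    # Slide the window: drop txt[s], append txt[s + m]
    for s in range(n - m):
        cur = (d * (cur - ord(txt[s]) * h) + ord(txt[s + m])) % q
        hashes.append(cur)
    return hashes


# Windows of "abcab" with m = 2 are "ab", "bc", "ca", "ab"
print(rolling_hashes("abcab", 2))
```

Python's % operator returns a non-negative result for a positive modulus, so the subtraction inside the update needs no extra correction here. In languages where % can return a negative value, q would have to be added back before taking the remainder.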
The idea is the same as evaluating a number digit by digit. For example, take the string "1234": once the value of the substring "12" is known to be 12, the value of the substring "123" can be computed as 12*10 + 3 = 123. The same logic is applied here, with the alphabet size in place of 10 and every result reduced modulo a prime.
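The digit-by-digit evaluation described above can be checked with a few lines of Python. The helper name extend_value is hypothetical and exists only for this illustration.

```python
def extend_value(prev, digit, base=10, mod=None):
    """Append one digit to a number already evaluated as prev."""
    value = prev * base + digit
    return value if mod is None else value % mod


# Build the values of "1", "12", "123", "1234" one digit at a time
value = 0
for ch in "1234":
    value = extend_value(value, int(ch))
    print(value)
```

Replacing base=10 with the alphabet size and passing a prime as mod gives the same update that the program uses to extend a substring hash by one character.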
# large prime used as the modulus to keep hash values bounded
mod = 9007199254740881

# string whose distinct substrings are counted
s = 'abcd'

# [start, end] index pairs of the distinct substrings found
l = []

# hash values already seen (Rabin-Karp style hashes)
d = {}

# number of characters in the input alphabet
D = 256

for i in range(len(s)):
    # hash of s[i..j], rebuilt from scratch for each start index i
    pre = 0
    for j in range(i, len(s)):
        # calculate new hash value by appending the next character
        pre = (pre * D + ord(s[j])) % mod
        # store the substring bounds if this hash has not been seen before
        if d.get(pre, -1) == -1:
            l.append([i, j])
        d[pre] = 1

# number of distinct substrings
print(len(l))

# the distinct substrings themselves
for i in range(len(l)):
    print(s[l[i][0]:l[i][1] + 1], end=" ")

Output:
10
a ab abc abcd b bc bcd c cd d
Time Complexity: O(N^2), where N is the length of the string
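Because the count is based on hash values, two different substrings that happen to share a hash would be counted once. The sketch below wraps the same O(N^2) loop in a function and compares it with a brute-force count on small inputs. The function names are assumptions made for this example, and storing the substring length next to the hash is a small variation on the program above, added so that substrings of different lengths can never be confused.

```python
def count_distinct_substrings(s, base=256, mod=9007199254740881):
    """Count distinct substrings of s by collecting (length, hash) pairs."""
    seen = set()
    for i in range(len(s)):
        h = 0
        for j in range(i, len(s)):
            # extend the hash of s[i..j-1] by the character s[j]
            h = (h * base + ord(s[j])) % mod
            seen.add((j - i + 1, h))
    return len(seen)


def count_by_brute_force(s):
    """Reference count that stores the substrings themselves."""
    return len({s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)})


for text in ("aba", "abcd", "banana"):
    print(text, count_distinct_substrings(text), count_by_brute_force(text))
```

The brute-force version is exact but stores every substring, so it is only meant as a check on short strings.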