Given a string of length n consisting of lowercase letters, count the total number of distinct substrings of the string.
Input : str = "ababa"
Output : 10
The total number of distinct substrings is 10 (counting the empty string): "", "a", "b", "ab", "ba", "aba", "bab", "abab", "baba" and "ababa"
A Suffix Trie based solution is discussed in the post below:
Count of distinct substrings of a string using Suffix Trie
We can also solve this problem using a suffix array together with the longest common prefix (lcp) array. A suffix array lists the starting indices of all suffixes of a given string, ordered so that the suffixes are sorted lexicographically.
For the string "ababa" the suffixes are: "ababa", "baba", "aba", "ba", "a", starting at indices 0, 1, 2, 3 and 4. Sorting these suffixes gives "a", "aba", "ababa", "ba", "baba", so the suffix array is [4, 2, 0, 3, 1].
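As an illustration, the suffix array above can be reproduced with a naive sort of the starting indices. This is a minimal sketch, not an efficient construction: the function name build_suffix_array is hypothetical, and comparing whole suffixes makes it roughly O(n^2 log n) in the worst case.

```python
def build_suffix_array(s):
    # Sort the starting indices by the suffix that begins at each index.
    # Naive approach for illustration only; not a linear-time construction.
    return sorted(range(len(s)), key=lambda i: s[i:])


if __name__ == "__main__":
    print(build_suffix_array("ababa"))  # [4, 2, 0, 3, 1]
```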
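Then we calculate the lcp array using Kasai's algorithm. Entry i of the lcp array is the length of the longest common prefix of the suffixes at positions i and i+1 of the suffix array, and the last entry is 0. For the string "ababa", the lcp array is [1, 3, 0, 2, 0].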
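A possible Python sketch of Kasai's algorithm is given here, under the convention that lcp[i] compares the suffixes at positions i and i+1 of the suffix array. The function names are hypothetical, and the suffix array is again built with a naive sort to keep the example self-contained.

```python
def build_suffix_array(s):
    # Naive construction, used here only to feed the lcp computation.
    return sorted(range(len(s)), key=lambda i: s[i:])


def kasai_lcp(s, sa):
    n = len(s)
    # rank[i] = position of the suffix starting at index i in the suffix array.
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos

    lcp = [0] * n
    k = 0  # length of the current common prefix, carried between iterations
    for i in range(n):
        if rank[i] == n - 1:
            # The last suffix in sorted order has no successor.
            k = 0
            continue
        j = sa[rank[i] + 1]  # start of the next suffix in sorted order
        while i + k < n and j + k < n and s[i + k] == s[j + k]:
            k += 1
        lcp[rank[i]] = k
        if k > 0:
            # Dropping the first character shortens the match by at most one.
            k -= 1
    return lcp


if __name__ == "__main__":
    text = "ababa"
    print(kasai_lcp(text, build_suffix_array(text)))  # [1, 3, 0, 2, 0]
```

Because k decreases by at most one per step of the outer loop, the inner while loop does O(n) work in total once the suffix array is available.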
After constructing both arrays, we calculate the total number of distinct substrings using this fact: every substring of a string is a prefix of one of its suffixes, so going through the prefixes of every suffix covers all substrings of the string.
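This fact can be checked directly with a brute-force sketch that collects every prefix of every suffix into a set. The function name is hypothetical, and the approach needs quadratic time and memory, so it is only useful for verifying small inputs.

```python
def distinct_substrings_bruteforce(s):
    # Start with the empty string, which the example counts as a substring.
    seen = {""}
    for start in range(len(s)):            # each suffix s[start:]
        for end in range(start + 1, len(s) + 1):
            seen.add(s[start:end])         # each non-empty prefix of it
    return seen


if __name__ == "__main__":
    subs = distinct_substrings_bruteforce("ababa")
    print(len(subs))  # 10
```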
The procedure for the example string is as follows:
String = "ababa"
Suffixes in sorted order: "a", "aba", "ababa", "ba", "baba"
Initialize the distinct substring count with the length of the first suffix:
Count = length("a") = 1
Substrings taken into consideration: "a"
Now consider each consecutive pair of suffixes. lcp("a", "aba") = 1, the length of the common prefix "a". Every prefix of "aba" that is longer than this common prefix is a substring not seen before. These prefixes end at the characters 'b' and 'a', so 2 is added to Count.
Count += length("aba") - lcp("a", "aba")
Count = 3
Substrings taken into consideration: "aba", "ab"
Similarly for the next pairs:
Count += length("ababa") - lcp("aba", "ababa")
Count = 5
Substrings taken into consideration: "ababa", "abab"
Count += length("ba") - lcp("ababa", "ba")
Count = 7
Substrings taken into consideration: "ba", "b"
Count += length("baba") - lcp("ba", "baba")
Count = 9
Substrings taken into consideration: "baba", "bab"
Finally, add 1 for the empty string.
Count = 10
The above idea is implemented in the code below.
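A minimal Python sketch of the counting procedure follows. It is an assumption-laden illustration rather than an optimized listing: the function names are hypothetical, the suffixes are sorted naively, and the lcp of each adjacent pair is computed by direct comparison instead of Kasai's algorithm.

```python
def common_prefix_len(a, b):
    # Length of the longest common prefix of two strings (direct comparison).
    k = 0
    while k < len(a) and k < len(b) and a[k] == b[k]:
        k += 1
    return k


def count_distinct_substrings(s):
    # Sorted suffixes; a suffix array would store their start indices instead.
    suffixes = sorted(s[i:] for i in range(len(s)))
    count = 1   # the empty string, counted as in the example
    prev = ""   # no previous suffix yet, so the first lcp is 0
    for suf in suffixes:
        # Prefixes longer than the lcp with the previous suffix are new.
        count += len(suf) - common_prefix_len(prev, suf)
        prev = suf
    return count


if __name__ == "__main__":
    print(count_distinct_substrings("ababa"))  # 10
```

Replacing the naive sort and the direct comparison with an efficient suffix array construction and Kasai's algorithm gives the same count, since the total is just the sum of the suffix lengths minus the sum of the lcp values, plus 1 for the empty string.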
This article is contributed by Utkarsh Trivedi.