Given a string of length n consisting of lowercase letters, count the total number of distinct substrings of the string.
Input : str = “ababa”
Output : 10
The 10 distinct substrings are "", "a", "b", "ab", "ba", "aba", "bab", "abab", "baba" and "ababa". Note that the empty string is counted.
We have discussed a Suffix Trie based solution in the post below:
Count of distinct substrings of a string using Suffix Trie
We can also solve this problem using a suffix array and the longest common prefix (LCP) concept. A suffix array is a sorted array of all suffixes of a given string, with each suffix stored as its starting index.
For the string “ababa”, the suffixes are “ababa”, “baba”, “aba”, “ba” and “a”, starting at indices 0 to 4. Sorting them gives “a”, “aba”, “ababa”, “ba”, “baba”, so the suffix array is [4, 2, 0, 3, 1].
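The construction just described can be sketched in Python as follows. This is a deliberately naive version that sorts whole suffixes; the function name `build_suffix_array` is an illustrative choice, not something from the article.

```python
def build_suffix_array(s: str) -> list[int]:
    """Return the starting indices of the suffixes of s in sorted order.

    Naive construction: the sort compares whole suffixes, so the worst
    case is about O(n^2 log n). Good enough for illustration.
    """
    return sorted(range(len(s)), key=lambda i: s[i:])


print(build_suffix_array("ababa"))  # [4, 2, 0, 3, 1]
```

For long strings, a faster construction (for example prefix doubling) would normally replace the plain sort.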
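Then we calculate the LCP array using Kasai’s algorithm. Entry i of the LCP array holds the length of the longest common prefix of the suffixes at positions i and i + 1 of the suffix array, and the last entry is 0. For the string “ababa”, the LCP array is [1, 3, 0, 2, 0].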
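A minimal sketch of Kasai's algorithm, assuming the convention that lcp[i] compares the suffixes at positions i and i + 1 of the suffix array (this matches the array shown for “ababa”). The name `kasai_lcp` and the one-line suffix sort used to feed it are illustrative assumptions.

```python
def kasai_lcp(s: str, sa: list[int]) -> list[int]:
    """lcp[i] = length of the longest common prefix of the suffixes
    starting at sa[i] and sa[i + 1]; the last entry is 0."""
    n = len(s)
    rank = [0] * n  # rank[start] = position of suffix s[start:] in sa
    for pos, start in enumerate(sa):
        rank[start] = pos

    lcp = [0] * n
    k = 0  # current match length, carried over between iterations
    for i in range(n):
        if rank[i] == n - 1:
            # Last suffix in sorted order has no successor.
            k = 0
            continue
        j = sa[rank[i] + 1]  # start of the next suffix in sorted order
        while i + k < n and j + k < n and s[i + k] == s[j + k]:
            k += 1
        lcp[rank[i]] = k
        if k > 0:
            k -= 1  # dropping the first character loses at most one match
    return lcp


s = "ababa"
sa = sorted(range(len(s)), key=lambda i: s[i:])  # naive suffix array
print(kasai_lcp(s, sa))  # [1, 3, 0, 2, 0]
```

Because k decreases by at most one per iteration and never exceeds n, the whole loop runs in O(n) time once the suffix array is available.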
After constructing both arrays, we count the distinct substrings by keeping this fact in mind: every substring is a prefix of some suffix, so if we look through the prefixes of each suffix of a string, we cover all substrings of that string.
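As a sanity check of this fact, the sketch below enumerates every non-empty prefix of every suffix and collects them in a set. It is a brute-force reference, not the suffix array method, and the name `count_distinct_bruteforce` is an illustrative choice. It adds 1 for the empty substring, following the convention of the example above.

```python
def count_distinct_bruteforce(s: str) -> int:
    """Count distinct substrings (including the empty one) by brute force."""
    seen = set()
    for start in range(len(s)):                    # suffix s[start:]
        for end in range(start + 1, len(s) + 1):   # each non-empty prefix of it
            seen.add(s[start:end])
    return len(seen) + 1  # +1 for the empty substring


print(count_distinct_bruteforce("ababa"))  # 10
```

This takes roughly O(n^3) time and O(n^3) memory in the worst case, so it is only useful for checking a faster solution on short strings.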
We will explain the procedure for the above example.

String = “ababa”
Suffixes in sorted order : “a”, “aba”, “ababa”, “ba”, “baba”

Initialize the distinct substring count with the length of the first suffix:
Count = length(“a”) = 1
Substrings taken in consideration : “a”

Now we consider each consecutive pair of suffixes.
lcp(“a”, “aba”) = 1 (the common prefix is “a”). All characters that are not part of the longest common prefix contribute a distinct substring. In this case they are 'b' and 'a', so they are added to Count.
Count += length(“aba”) - lcp(“a”, “aba”)
Count = 3
Substrings taken in consideration : “aba”, “ab”

Similarly for the next pairs:
Count += length(“ababa”) - lcp(“aba”, “ababa”)
Count = 5
Substrings taken in consideration : “ababa”, “abab”

Count += length(“ba”) - lcp(“ababa”, “ba”)
Count = 7
Substrings taken in consideration : “ba”, “b”

Count += length(“baba”) - lcp(“ba”, “baba”)
Count = 9
Substrings taken in consideration : “baba”, “bab”

We finally add 1 for the empty string.
Count = 10
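The walkthrough can be condensed into a closed form. The notation below is an assumption made for illustration: SA is the suffix array, LCP[i] is the common prefix length of the suffixes at positions i and i + 1 of SA, and n is the string length. Since the suffix lengths n - SA[i] are a permutation of 1 to n, their sum is n(n + 1)/2.

```latex
\[
\text{Count}
  = 1 + \sum_{i=0}^{n-1} \bigl(n - SA[i]\bigr) - \sum_{i=0}^{n-2} LCP[i]
  = 1 + \frac{n(n+1)}{2} - \sum_{i=0}^{n-2} LCP[i]
\]
% For "ababa": n = 5 and LCP = [1, 3, 0, 2, 0], so
% Count = 1 + 15 - (1 + 3 + 0 + 2) = 10.
```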
The above idea is implemented in the code below.
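What follows is a minimal Python sketch of the procedure, not the article's original listing. It mirrors the walkthrough step by step: sort the suffixes, then for each one add its length minus the common prefix it shares with the previous suffix. To stay short it sorts suffixes naively and compares neighbours character by character. The function name `count_distinct_substrings` is an illustrative choice.

```python
def count_distinct_substrings(s: str) -> int:
    """Count distinct substrings of s, including the empty substring."""
    n = len(s)
    # Naive suffix array: starting indices of suffixes in sorted order.
    sa = sorted(range(n), key=lambda i: s[i:])

    count = 1      # the empty substring
    prev = None    # start index of the previous suffix in sorted order
    for start in sa:
        length = n - start
        common = 0
        if prev is not None:
            # Longest common prefix with the previous suffix.
            while (common < length and prev + common < n
                   and s[start + common] == s[prev + common]):
                common += 1
        # Prefixes longer than the shared part are new substrings.
        count += length - common
        prev = start
    return count


print(count_distinct_substrings("ababa"))  # 10
```

The direct comparison of neighbours can cost O(n^2) in total on highly repetitive strings. Computing the LCP values with Kasai's algorithm instead would make that step linear.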
This article is contributed by Utkarsh Trivedi.