Given a string of length n of lowercase alphabet characters, we need to count total number of distinct substrings of this string.
Input : str = “ababa” Output : 10 Total number of distinct substring are 10, which are, "", "a", "b", "ab", "ba", "aba", "bab", "abab", "baba" and "ababa"
We have discussed a Suffix Trie based solution in below post :
Count of distinct substrings of a string using Suffix Trie
We can solve this problem using suffix array and longest common prefix concept. A suffix array is a sorted array of all suffixes of a given string.
For string “ababa” suffixes are : “ababa”, “baba”, “aba”, “ba”, “a”. After taking these suffixes in sorted form we get our suffix array as [4, 2, 0, 3, 1]
Then we calculate lcp array using kasai’s algorithm. For string “ababa”, lcp array is [1, 3, 0, 2, 0]
After constructing both arrays, we calculate total number of distinct substring by keeping this fact in mind : If we look through the prefixes of each suffix of a string, we cover all substrings of that string.
We will explain the procedure for above example,
String = “ababa” Suffixes in sorted order : “a”, “aba”, “ababa”, “ba”, “baba” Initializing distinct substring count by length of first suffix, Count = length(“a”) = 1 Substrings taken in consideration : “a” Now we consider each consecutive pair of suffix, lcp("a", "aba") = "a". All characters that are not part of the longest common prefix contribute to a distinct substring. In the above case, they are 'b' and ‘a'. So they should be added to Count. Count += length(“aba”) - lcp(“a”, “aba”) Count = 3 Substrings taken in consideration : “aba”, “ab” Similarly for next pair also, Count += length(“ababa”) - lcp(“aba”, “ababa”) Count = 5 Substrings taken in consideration : “ababa”, “abab” Count += length(“ba”) - lcp(“ababa”, “ba”) Count = 7 Substrings taken in consideration : “ba”, “b” Count += length(“baba”) - lcp(“ba”, “baba”) Count = 9 Substrings taken in consideration : “baba”, “bab” We finally add 1 for empty string. count = 10
Above idea is implemented in below code.
This article is contributed by Utkarsh Trivedi. If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to firstname.lastname@example.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.
Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.
Attention reader! Don’t stop learning now. Get hold of all the important DSA concepts with the DSA Self Paced Course at a student-friendly price and become industry ready.
- Count of distinct substrings of a string using Suffix Trie
- Suffix Tree Application 4 - Build Linear Time Suffix Array
- Find distinct characters in distinct substrings of a string
- Count distinct substrings of a string using Rabin Karp algorithm
- Count of Distinct Substrings occurring consecutively in a given String
- Queries for number of distinct integers in Suffix
- Count number of substrings with exactly k distinct characters
- Count distinct substrings that contain some characters at most k times
- Count number of distinct substrings of a given length
- Count of substrings of length K with exactly K distinct characters
- Count of substrings having all distinct characters
- Count of Substrings with at least K pairwise Distinct Characters having same Frequency
- Generate a String of having N*N distinct non-palindromic Substrings
- Minimum changes to a string to make all substrings distinct
- Count ways to split a Binary String into three substrings having equal count of zeros
- Longest palindromic string formed by concatenation of prefix and suffix of a string
- Print the longest prefix of the given string which is also the suffix of the same string
- Find the longest sub-string which is prefix, suffix and also present inside the string
- Find the longest sub-string which is prefix, suffix and also present inside the string | Set 2
- Count of suffix increment/decrement operations to construct a given array