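Queries to find the total number of duplicate characters in the range L to R of a string S
Given a string S of size N consisting of lowercase letters and an integer Q representing the number of queries on S, the task is to print, for each query, the number of duplicate characters in the substring from L to R. A character is a duplicate if it occurs more than once in the substring, and each such character is counted once. Positions are 1-indexed and both ends of the range are inclusive.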
Note: 1 ≤ N ≤ 10^6 and 1 ≤ Q ≤ 10^6
S = “geeksforgeeks”, Q = 2
L = 1 R = 5
L = 4 R = 8
For the first query, ‘e’ is the only duplicate character in the range 1 to 5 of S (“geeks”).
For the second query, the range 4 to 8 of S (“ksfor”) contains no duplicate character.
S = “geekyy”, Q = 1
L = 1 R = 6
For this query, ‘e’ and ‘y’ are the duplicate characters in the range 1 to 6 of S.
The naive approach is to maintain a frequency array of size 26 that stores the count of each character. For each query with range [L, R], traverse the substring from S[L] to S[R] and count the occurrences of each character. Every character whose frequency is greater than 1 adds 1 to the answer. Each query costs O(R - L + 1) time, so the total is O(N * Q) in the worst case, which is too slow for the given constraints.
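The naive approach can be sketched in Python as follows. This is only an illustration: the function name `count_duplicates_naive` is an assumption, and the query bounds are taken to be 1-indexed and inclusive, as in the examples.

```python
def count_duplicates_naive(s: str, left: int, right: int) -> int:
    """Count the distinct letters that occur more than once in s[left..right].

    Assumes 1-indexed, inclusive bounds and lowercase letters only.
    """
    freq = [0] * 26
    # Python slices are 0-indexed and end-exclusive, hence left - 1.
    for ch in s[left - 1:right]:
        freq[ord(ch) - ord("a")] += 1
    # Each letter seen at least twice contributes 1 to the answer.
    return sum(1 for count in freq if count > 1)


print(count_duplicates_naive("geeksforgeeks", 1, 5))  # 1
print(count_duplicates_naive("geeksforgeeks", 4, 8))  # 0
```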
To answer the queries efficiently, preprocess S by storing, for each of the 26 letters, the positions at which it occurs, in increasing order, in a dynamic array (vector). For each query, iterate over all 26 lowercase letters. For each letter, binary-search its vector for the first position that is greater than or equal to L. The letter is a duplicate in S[L..R] exactly when the element following that position exists in the vector and is less than or equal to R, since this means at least two of its occurrences lie within [L, R].
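For example, for S = “geeksforgeeks” with 1-indexed positions, the dynamic array for ‘e’ holds [2, 3, 10, 11], the one for ‘g’ holds [1, 9], the one for ‘k’ holds [4, 12] and the one for ‘s’ holds [5, 13]. Each of ‘f’, ‘o’ and ‘r’ has a single position, and the arrays of all other letters are empty.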
Below is the implementation of the above approach:
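A minimal Python sketch of the approach, assuming 1-indexed inclusive query bounds as in the examples. The function names `build_positions` and `count_duplicates` are illustrative and not taken from the original listing. The binary search uses `bisect_left` from the standard library.

```python
from bisect import bisect_left


def build_positions(s: str) -> list:
    """Return 26 sorted lists holding the 1-indexed positions of each letter."""
    positions = [[] for _ in range(26)]
    for index, ch in enumerate(s, start=1):
        positions[ord(ch) - ord("a")].append(index)
    return positions


def count_duplicates(positions: list, left: int, right: int) -> int:
    """Count the letters that occur at least twice within [left, right]."""
    answer = 0
    for occurrences in positions:
        # Index of the first occurrence at a position >= left.
        i = bisect_left(occurrences, left)
        # The letter is a duplicate if the next occurrence is also <= right.
        if i + 1 < len(occurrences) and occurrences[i + 1] <= right:
            answer += 1
    return answer


positions = build_positions("geeksforgeeks")
print(count_duplicates(positions, 1, 5))  # 1
print(count_duplicates(positions, 4, 8))  # 0
```

The position lists are built once and reused for every query, so each query only pays for 26 binary searches.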
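Time complexity: O(N + Q * 26 * log N), that is, O(N) to build the position arrays plus one binary search for each of the 26 letters per query.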