# Count number of substrings with exactly k distinct characters

Given a string of lowercase letters, count all substrings (not necessarily distinct) that have exactly k distinct characters.
Examples:

```
Input: abc, k = 2
Output: 2
Possible substrings are {"ab", "bc"}

Input: aba, k = 2
Output: 3
Possible substrings are {"ab", "ba", "aba"}

Input: aa, k = 1
Output: 3
Possible substrings are {"a", "a", "aa"}
```

Method 1 (Brute Force)

If the length of the string is n, there are n*(n+1)/2 non-empty substrings. A simple approach is to generate every substring and check whether it has exactly k distinct characters. There are O(n*n) substrings and each check takes O(n), so the overall running time is O(n*n*n).
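The brute-force idea can be sketched in Python as follows. This is a minimal illustration, not the article's reference implementation; the function name `count_k_distinct_brute` is hypothetical, and the distinct-character check uses a `set` built from each substring.

```python
def count_k_distinct_brute(s, k):
    """Count substrings of s with exactly k distinct characters (O(n^3))."""
    n = len(s)
    res = 0
    # Enumerate every substring s[i:j] with 0 <= i < j <= n
    for i in range(n):
        for j in range(i + 1, n + 1):
            # Building the set costs O(j - i), i.e. O(n) per substring
            if len(set(s[i:j])) == k:
                res += 1
    return res


if __name__ == "__main__":
    print(count_k_distinct_brute("abc", 2))  # 2
    print(count_k_distinct_brute("aba", 2))  # 3
    print(count_k_distinct_brute("aa", 1))   # 3
```

The three calls reproduce the examples given at the top of the article.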

Method 2

The problem can be solved in O(n*n) time. The idea is to maintain a hash table of character counts while extending each substring one character at a time, and to use that table to track the number of distinct characters seen so far.
The implementation below assumes that the input string contains only characters from ‘a’ to ‘z’.

Implementation

## C++

```
// C++ program to count number of substrings with
// exactly k distinct characters in a given string
#include <cstring>
#include <iostream>
#include <string>
using namespace std;

// Function to count number of substrings
// with exactly k unique characters
int countkDist(string str, int k)
{
    int n = str.length();

    // Initialize result
    int res = 0;

    // To store count of characters from 'a' to 'z'
    int cnt[26];

    // Consider all substrings beginning with
    // str[i]
    for (int i = 0; i < n; i++)
    {
        int dist_count = 0;

        // Initializing array with 0
        memset(cnt, 0, sizeof(cnt));

        // Consider all substrings between str[i..j]
        for (int j = i; j < n; j++)
        {
            // If this is a new character for this
            // substring, increment dist_count.
            if (cnt[str[j] - 'a'] == 0)
                dist_count++;

            // Increment count of current character
            cnt[str[j] - 'a']++;

            // If distinct character count becomes k,
            // then increment result.
            if (dist_count == k)
                res++;
            if (dist_count > k)
                break;
        }
    }

    return res;
}

// Driver Program
int main()
{
    string str = "abcbaa";
    int k = 3;
    cout << "Total substrings with exactly "
         << k << " distinct characters : "
         << countkDist(str, k) << endl;
    return 0;
}
```

## Java

```
// Java program to count number of substrings
// with exactly k distinct characters in a given string
import java.util.Arrays;

public class CountKSubStr
{
    // Function to count number of substrings
    // with exactly k unique characters
    int countkDist(String str, int k)
    {
        // Initialize result
        int res = 0;

        int n = str.length();

        // To store count of characters from 'a' to 'z'
        int cnt[] = new int[26];

        // Consider all substrings beginning with
        // str[i]
        for (int i = 0; i < n; i++)
        {
            int dist_count = 0;

            // Initializing count array with 0
            Arrays.fill(cnt, 0);

            // Consider all substrings between str[i..j]
            for (int j = i; j < n; j++)
            {
                // If this is a new character for this
                // substring, increment dist_count.
                if (cnt[str.charAt(j) - 'a'] == 0)
                    dist_count++;

                // Increment count of current character
                cnt[str.charAt(j) - 'a']++;

                // If distinct character count becomes k,
                // then increment result.
                if (dist_count == k)
                    res++;
                if (dist_count > k)
                    break;
            }
        }

        return res;
    }

    // Driver Program
    public static void main(String[] args)
    {
        CountKSubStr ob = new CountKSubStr();
        String str = "abcbaa";
        int k = 3;
        System.out.println("Total substrings with exactly "
                           + k + " distinct characters : "
                           + ob.countkDist(str, k));
    }
}
```

## Python 3

```
# Python3 program to count number of
# substrings with exactly k distinct
# characters in a given string

# Function to count number of substrings
# with exactly k unique characters
def countkDist(str1, k):
    n = len(str1)

    # Initialize result
    res = 0

    # Consider all substrings beginning
    # with str1[i]
    for i in range(0, n):
        dist_count = 0

        # To store count of characters from
        # 'a' to 'z', initialized with 0
        cnt = [0] * 26

        # Consider all substrings between str1[i..j]
        for j in range(i, n):

            # If this is a new character for this
            # substring, increment dist_count.
            if cnt[ord(str1[j]) - ord('a')] == 0:
                dist_count += 1

            # Increment count of current character
            cnt[ord(str1[j]) - ord('a')] += 1

            # If distinct character count becomes k,
            # then increment result.
            if dist_count == k:
                res += 1
            if dist_count > k:
                break

    return res

# Driver Code
if __name__ == "__main__":
    str1 = "abcbaa"
    k = 3
    print("Total substrings with exactly", k,
          "distinct characters : ", end="")
    print(countkDist(str1, k))

# This code is contributed by
# Sairahul Jella
```

## C#

```
// C# program to count number of substrings
// with exactly k distinct characters in a given string
using System;

public class CountKSubStr
{
    // Function to count number of substrings
    // with exactly k unique characters
    int countkDist(string str, int k)
    {
        // Initialize result
        int res = 0;

        int n = str.Length;

        // To store count of characters from 'a' to 'z'
        int[] cnt = new int[26];

        // Consider all substrings beginning with
        // str[i]
        for (int i = 0; i < n; i++)
        {
            int dist_count = 0;

            // Initializing count array with 0
            Array.Clear(cnt, 0, cnt.Length);

            // Consider all substrings between str[i..j]
            for (int j = i; j < n; j++)
            {
                // If this is a new character for this
                // substring, increment dist_count.
                if (cnt[str[j] - 'a'] == 0)
                    dist_count++;

                // Increment count of current character
                cnt[str[j] - 'a']++;

                // If distinct character count becomes k,
                // then increment result.
                if (dist_count == k)
                    res++;
                if (dist_count > k)
                    break;
            }
        }

        return res;
    }

    // Driver Program
    public static void Main()
    {
        CountKSubStr ob = new CountKSubStr();
        string str = "abcbaa";
        int k = 3;
        Console.WriteLine("Total substrings with exactly "
                          + k + " distinct characters : "
                          + ob.countkDist(str, k));
    }
}
```


Output:

```
Total substrings with exactly 3 distinct characters : 8
```

Time Complexity : O(n*n)

Exercise (Further Optimization):
The above code resets the count array “cnt[]” in every iteration of the outer loop. This can be very costly for a large alphabet. Can we modify the above program so that cnt[] is not reset every time?
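One possible way to approach the exercise, sketched in Python, is to replace the counts with "stamps". This is an assumption about what the exercise is after, not the article's own solution. Each array slot records the outer-loop index during which that character was last seen, so a slot written in an earlier iteration is automatically treated as unseen and nothing needs clearing. The names `count_k_distinct_stamped` and `last_seen_start` are hypothetical.

```python
def count_k_distinct_stamped(s, k):
    """Count substrings with exactly k distinct characters without
    resetting the per-character table on every outer iteration.
    Assumes s contains only 'a'..'z'."""
    n = len(s)
    res = 0
    # last_seen_start[c] == i means character c already occurred in the
    # substring that starts at index i; any other value means "not yet"
    last_seen_start = [-1] * 26

    for i in range(n):
        dist_count = 0
        for j in range(i, n):
            c = ord(s[j]) - ord('a')
            # A stale stamp from an earlier i counts as unseen
            if last_seen_start[c] != i:
                last_seen_start[c] = i
                dist_count += 1
            if dist_count == k:
                res += 1
            elif dist_count > k:
                break
    return res


if __name__ == "__main__":
    print(count_k_distinct_stamped("abcbaa", 3))  # 8
```

The running time is still O(n*n), but the per-iteration cost no longer depends on the alphabet size, since the table is allocated and initialized only once.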

