Count number of substrings having at least K distinct characters
  • Difficulty Level : Medium
  • Last Updated : 05 May, 2021

Given a string S consisting of N characters and a positive integer K, the task is to count the number of substrings having at least K distinct characters.

Examples:

Input: S = “abcca”, K = 3
Output: 4
Explanation:
The substrings that contain at least K(= 3) distinct characters are:

  1. “abc”: Count of distinct characters = 3.
  2. “abcc”: Count of distinct characters = 3.
  3. “abcca”: Count of distinct characters = 3.
  4. “bcca”: Count of distinct characters = 3.

Therefore, the total count of substrings is 4.

Input: S = “abcca”, K = 4
Output: 0



Naive Approach: The simplest approach to solve the given problem is to generate all substrings of the given string and count those substrings that have at least K distinct characters in them. After checking for all the substrings, print the total count obtained as the result. 
Time Complexity: O(N^3)
Auxiliary Space: O(256)
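
As a rough illustration of the naive approach, the Python sketch below enumerates every substring and counts those whose set of characters has at least K elements. The function name count_naive is hypothetical and is not part of the original article.

```python
def count_naive(s, k):
    """Brute-force count of substrings with at least k distinct characters."""
    n = len(s)
    count = 0
    # Try every start index i and every exclusive end index j
    for i in range(n):
        for j in range(i + 1, n + 1):
            # set() keeps one copy of each character in s[i:j]
            if len(set(s[i:j])) >= k:
                count += 1
    return count


print(count_naive("abcca", 3))  # 4
print(count_naive("abcca", 4))  # 0
```

Building the set for each of the O(N^2) substrings costs up to O(N) time, which is where the cubic bound comes from.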

Efficient Approach: The above approach can be optimized by using a Sliding Window together with Hashing. Follow the steps below to solve the problem:

  • Initialize a variable, say ans as 0 to store the count of substrings having at least K distinct characters.
  • Initialize two pointers, begin and end to store the starting and ending point of the sliding window.
  • Initialize a HashMap, say M to store the frequency of characters in the window.
  • Iterate until end is less than N, and perform the following steps:
    • Include the character at the end of the window by incrementing the value of S[end] in M by 1, and then increment end by 1, so that the window covers the indices begin to end - 1.
    • While the size of M is at least K, perform the following steps:
      • Remove the character at the start of the window by decrementing the value of S[begin] in M by 1.
      • If its frequency becomes 0, then erase it from the map M.
      • The window had at least K distinct characters when this iteration started, so every substring that starts at begin and ends at index end - 1 or later qualifies. Count them by incrementing ans by (N - end + 1).
      • Increment begin by 1.
  • After completing the above steps, print the value of ans as the result.
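
To see where the term (N - end + 1) comes from, the following Python sketch records every update of the answer instead of only the final total. It is an illustration under the assumption that K is positive; the name window_contributions and the tuple layout (begin, end, added) are hypothetical and do not appear in the original article.

```python
from collections import defaultdict


def window_contributions(s, k):
    """Return one (begin, end, added) tuple per update of the answer.

    Assumes k is a positive integer, as in the problem statement.
    """
    n = len(s)
    freq = defaultdict(int)  # frequency of each character in the window
    begin = end = 0
    steps = []
    while end < n:
        # Extend the window to the right
        freq[s[end]] += 1
        end += 1  # the window is now s[begin:end]
        while len(freq) >= k:
            # Every substring starting at begin and ending at index
            # end - 1 or later has at least k distinct characters
            steps.append((begin, end, n - end + 1))
            pre = s[begin]
            freq[pre] -= 1
            if freq[pre] == 0:
                del freq[pre]
            begin += 1
    return steps


steps = window_contributions("abcca", 3)
print(steps)                                # [(0, 3, 3), (1, 5, 1)]
print(sum(added for _, _, added in steps))  # 4
```

For S = "abcca" and K = 3, the first tuple accounts for "abc", "abcc" and "abcca", and the second for "bcca", which matches the four substrings listed in the example.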

Below is the implementation of the above approach:

C++




// C++ program for the above approach
 
#include <bits/stdc++.h>
using namespace std;
 
// Function to count the number of substrings
// having at least k distinct characters
void atleastkDistinctChars(string s, int k)
{
    // Stores the size of the string
    int n = s.size();
 
    // Initialize a HashMap
    unordered_map<char, int> mp;
 
    // Stores the start and end
    // indices of sliding window
    int begin = 0, end = 0;
 
    // Stores the required result
    int ans = 0;
 
    // Iterate while the end
    // pointer is less than n
    while (end < n) {
 
        // Include the character at
        // the end of the window
        char c = s[end];
        mp[c]++;
 
        // Increment end pointer by 1
        end++;
 
        // Iterate until count of distinct
        // characters becomes less than K
        while (mp.size() >= k) {
 
            // Remove the character from
            // the beginning of window
            char pre = s[begin];
            mp[pre]--;
 
            // If its frequency is 0,
            // remove it from the map
            if (mp[pre] == 0) {
                mp.erase(pre);
            }
 
            // Update the answer
            ans += s.length() - end + 1;
            begin++;
        }
    }
 
    // Print the result
    cout << ans;
}
 
// Driver Code
int main()
{
    string S = "abcca";
    int K = 3;
    atleastkDistinctChars(S, K);
 
    return 0;
}

Java




// Java program for the above approach
import java.util.*;
class GFG
{
   
// Function to count the number of substrings
// having at least k distinct characters
static void atleastkDistinctChars(String s, int k)
{
   
    // Stores the size of the string
    int n = s.length();
 
    // Initialize a HashMap
    Map<Character, Integer> mp = new HashMap<>();
 
    // Stores the start and end
    // indices of sliding window
    int begin = 0, end = 0;
 
    // Stores the required result
    int ans = 0;
 
    // Iterate while the end
    // pointer is less than n
    while (end < n) {
 
        // Include the character at
        // the end of the window
        char c = s.charAt(end);
        mp.put(c,mp.getOrDefault(c,0)+1);
 
        // Increment end pointer by 1
        end++;
 
        // Iterate until count of distinct
        // characters becomes less than K
        while (mp.size() >= k) {
 
            // Remove the character from
            // the beginning of window
            char pre = s.charAt(begin);
            mp.put(pre,mp.getOrDefault(pre,0)-1);
 
            // If its frequency is 0,
            // remove it from the map
            if (mp.get(pre)==0){
                mp.remove(pre);
            }
 
            // Update the answer
            ans += s.length() - end + 1;
            begin++;
        }
    }
 
    // Print the result
    System.out.println(ans);
}
   
  // Driver code
public static void main (String[] args)
{
   
      // Given inputs
    String S = "abcca";
    int K = 3;
    atleastkDistinctChars(S, K);
 
    }
}
 
// This code is contributed by offbeat

Python3




# Python 3 program for the above approach
from collections import defaultdict
 
# Function to count the number of substrings
# having at least k distinct characters
def atleastkDistinctChars(s, k):
 
    # Stores the size of the string
    n = len(s)
 
    # Initialize a HashMap
    mp = defaultdict(int)
 
    # Stores the start and end
    # indices of sliding window
    begin = 0
    end = 0
 
    # Stores the required result
    ans = 0
 
    # Iterate while the end
    # pointer is less than n
    while (end < n):
 
        # Include the character at
        # the end of the window
        c = s[end]
        mp[c] += 1
 
        # Increment end pointer by 1
        end += 1
 
        # Iterate until count of distinct
        # characters becomes less than K
        while (len(mp) >= k):
 
            # Remove the character from
            # the beginning of window
            pre = s[begin]
            mp[pre] -= 1
 
            # If its frequency is 0,
            # remove it from the map
            if (mp[pre] == 0):
                del mp[pre]
 
            # Update the answer
            ans += len(s) - end + 1
            begin += 1
 
    # Print the result
    print(ans)
 
 
# Driver Code
if __name__ == "__main__":
 
    S = "abcca"
    K = 3
    atleastkDistinctChars(S, K)
 
    # This code is contributed by ukasp.
Output: 
4

 

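Time Complexity: O(N), since each character is added to and removed from the window at most once, assuming O(1) average-time map operations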
Auxiliary Space: O(256), which is constant when the input is limited to at most 256 distinct character values
