Minimum length substring with exactly K distinct characters

Given a string S and a number K, the task is to find the minimum-length substring of S that has exactly K distinct characters.

Note: The string S consists only of lowercase English letters.

Examples:

Input:  S = "ababcb", K = 3
Output:  abc

Input:  S = "efecfefd", K = 4
Output:  cfefd


Simple Solution: A simple solution is to consider each substring and check whether it contains exactly K distinct characters. If it does, compare its length with that of the shortest such substring found so far. If the distinct-character count is maintained incrementally while the end of the substring is extended, the time complexity of this approach is O(N^2), where N is the length of the string S.
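
The simple approach can be sketched in Python as follows. This is only an illustrative sketch: the function name min_len_substring_brute is hypothetical, and returning an empty string when no substring qualifies is an assumption, since the article does not say what to do in that case.

```python
def min_len_substring_brute(s, k):
    """Return the shortest substring of s with exactly k distinct
    characters, or "" if no such substring exists (assumed convention)."""
    best = ""
    n = len(s)
    for i in range(n):
        seen = set()  # distinct characters in s[i..j]
        for j in range(i, n):
            seen.add(s[j])
            if len(seen) > k:
                break
            if len(seen) == k:
                # First j that reaches k distinct characters gives the
                # shortest valid substring starting at i.
                if not best or j - i + 1 < len(best):
                    best = s[i:j + 1]
                break
    return best


if __name__ == "__main__":
    print(min_len_substring_brute("ababcb", 3))    # abc
    print(min_len_substring_brute("efecfefd", 4))  # cfefd
```

For every start index the inner loop scans forward at most once, which gives the quadratic bound in the worst case.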

Efficient Solution: An efficient solution is to use the sliding window technique with a character-count table (hashing). Two pointers, st and end, mark the start and end of the window, and both initially point to the beginning of the string. Move end forward and increment the count of the character it reaches. If that count becomes one, a new distinct character has entered the window, so increment the number of distinct characters. If the number of distinct characters exceeds K, move st forward, decrementing the count of each character that leaves the window; whenever a count drops to zero, a distinct character has been removed, and this continues until the number of distinct characters is back to K. Once the number of distinct characters is exactly K, shrink the window further by moving st forward while the character at st has a count greater than 1, since dropping it does not change the number of distinct characters. Then compare the length of the current window with the minimum length found so far and update it if necessary.

Note that each character is added to and removed from the sliding window at most once, so each character is visited at most twice. Hence the time complexity is linear.
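
To make the linear bound concrete, here is a small instrumented Python sketch of the sliding window that also counts how often each pointer advances. The function name min_window_with_stats and the returned tuple are hypothetical, chosen only for this illustration, and the empty-string result for "no valid substring" is an assumption.

```python
def min_window_with_stats(s, k):
    """Sliding window over lowercase letters.

    Returns (best_substring, end_moves, st_moves); best_substring is ""
    if no substring has exactly k distinct characters (assumed convention).
    """
    cnt = [0] * 26   # count of each letter inside the window
    st = 0           # start of the window
    dist = 0         # number of distinct letters inside the window
    best = ""
    st_moves = 0
    for end, ch in enumerate(s):
        idx = ord(ch) - ord('a')
        cnt[idx] += 1
        if cnt[idx] == 1:
            dist += 1
        # Too many distinct letters: drop letters from the front.
        while st < end and dist > k:
            j = ord(s[st]) - ord('a')
            cnt[j] -= 1
            if cnt[j] == 0:
                dist -= 1
            st += 1
            st_moves += 1
        if dist == k:
            # Drop leading letters that also occur later in the window.
            while st < end and cnt[ord(s[st]) - ord('a')] > 1:
                cnt[ord(s[st]) - ord('a')] -= 1
                st += 1
                st_moves += 1
            if not best or end - st + 1 < len(best):
                best = s[st:end + 1]
    # end advances exactly once per character of s.
    return best, len(s), st_moves


if __name__ == "__main__":
    print(min_window_with_stats("efecfefd", 4))  # ('cfefd', 8, 3)
```

Because st never moves backwards and never passes end, st_moves can never exceed the length of the string, so the total number of pointer advances is at most 2N.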

Below is the implementation of the above approach:

C++

// C++ program to find minimum length substring
// having exactly k distinct character.
  
#include <bits/stdc++.h>
using namespace std;
  
// Function to find minimum length substring
// having exactly k distinct character.
string findMinLenStr(string str, int k)
{
    int n = str.length();
  
    // Starting index of sliding window.
    int st = 0;
  
    // Ending index of sliding window.
    int end = 0;
  
    // To store count of character.
    int cnt[26];
    memset(cnt, 0, sizeof(cnt));
  
    // To store count of distinct
    // character in current sliding
    // window.
    int distEle = 0;
  
    // To store length of current
    // sliding window.
    int currlen;
  
    // To store minimum length.
    int minlen = n;
  
    // To store starting index of minimum
    // length substring.
    int startInd = -1;
  
    while (end < n) {
  
        // Increment count of current character
        // If this count is one then a new
        // distinct character is found in
        // sliding window.
        cnt[str[end] - 'a']++;
        if (cnt[str[end] - 'a'] == 1)
            distEle++;
  
        // If number of distinct characters is
        // greater than k, then move starting
        // point of sliding window forward,
        // until count is k.
        if (distEle > k) {
            while (st < end && distEle > k) {
                if (cnt[str[st] - 'a'] == 1)
                    distEle--;
                cnt[str[st] - 'a']--;
                st++;
            }
        }
  
        // Remove characters from the beginning of
        // sliding window having count more than 1
        // to minimize length.
        if (distEle == k) {
            while (st < end && cnt[str[st] - 'a'] > 1) {
                cnt[str[st] - 'a']--;
                st++;
            }
  
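            // Compare length with minimum length
            // and update if required. The startInd
            // check lets a window of length n (the
            // whole string) be recorded as well.
            currlen = end - st + 1;
            if (startInd == -1 || currlen < minlen) {
                minlen = currlen;
                startInd = st;
            }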
        }
  
        end++;
    }
  
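    // Return minimum length substring, or an
    // empty string if no substring has exactly
    // k distinct characters.
    if (startInd == -1)
        return "";
    return str.substr(startInd, minlen);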
}
  
// Driver code
int main()
{
    string str = "efecfefd";
  
    int k = 4;
  
    cout << findMinLenStr(str, k);
  
    return 0;
}

Java

// Java program to find minimum length substring
// having exactly k distinct characters.
class GFG
{

// Function to find minimum length substring
// having exactly k distinct characters.
static String findMinLenStr(String str, int k)
{
    int n = str.length();

    // Starting index of sliding window.
    int st = 0;

    // Ending index of sliding window.
    int end = 0;

    // To store count of each character
    // (Java initializes the array to zero).
    int cnt[] = new int[26];

    // To store count of distinct
    // characters in current sliding
    // window.
    int distEle = 0;

    // To store length of current
    // sliding window.
    int currlen;

    // To store minimum length.
    int minlen = n;

    // To store starting index of minimum
    // length substring.
    int startInd = -1;

    while (end < n)
    {

        // Increment count of current character.
        // If this count is one then a new
        // distinct character is found in
        // sliding window.
        cnt[str.charAt(end) - 'a']++;
        if (cnt[str.charAt(end) - 'a'] == 1)
            distEle++;

        // If number of distinct characters is
        // greater than k, then move starting
        // point of sliding window forward,
        // until count is k.
        if (distEle > k)
        {
            while (st < end && distEle > k)
            {
                if (cnt[str.charAt(st) - 'a'] == 1)
                    distEle--;
                cnt[str.charAt(st) - 'a']--;
                st++;
            }
        }

        // Remove characters from the beginning of
        // sliding window having count more than 1
        // to minimize length.
        if (distEle == k)
        {
            while (st < end && cnt[str.charAt(st) - 'a'] > 1)
            {
                cnt[str.charAt(st) - 'a']--;
                st++;
            }

            // Compare length with minimum length
            // and update if required. The startInd
            // check lets a window of length n (the
            // whole string) be recorded as well.
            currlen = end - st + 1;
            if (startInd == -1 || currlen < minlen)
            {
                minlen = currlen;
                startInd = st;
            }
        }

        end++;
    }

    // Return minimum length substring, or an
    // empty string if no substring has exactly
    // k distinct characters.
    if (startInd == -1)
        return "";
    return str.substring(startInd, startInd + minlen);
}

// Driver code
public static void main(String args[])
{
    String str = "efecfefd";
    int k = 4;
    System.out.println(findMinLenStr(str, k));
}
}

// This code is contributed by Arnab Kundu

Python 3

# Python 3 program to find minimum length
# substring having exactly k distinct characters.

# Function to find minimum length substring
# having exactly k distinct characters.
def findMinLenStr(s, k):

    n = len(s)

    # Starting index of sliding window.
    st = 0

    # Ending index of sliding window.
    end = 0

    # To store count of each character.
    cnt = [0] * 26

    # To store count of distinct
    # characters in current sliding
    # window.
    distEle = 0

    # To store minimum length.
    minlen = n

    # To store starting index of minimum
    # length substring.
    startInd = -1

    while end < n:

        # Increment count of current character.
        # If this count is one then a new
        # distinct character is found in
        # sliding window.
        cnt[ord(s[end]) - ord('a')] += 1
        if cnt[ord(s[end]) - ord('a')] == 1:
            distEle += 1

        # If number of distinct characters is
        # greater than k, then move starting
        # point of sliding window forward,
        # until count is k.
        if distEle > k:
            while st < end and distEle > k:
                if cnt[ord(s[st]) - ord('a')] == 1:
                    distEle -= 1
                cnt[ord(s[st]) - ord('a')] -= 1
                st += 1

        # Remove characters from the beginning of
        # sliding window having count more than 1
        # to minimize length.
        if distEle == k:
            while st < end and cnt[ord(s[st]) - ord('a')] > 1:
                cnt[ord(s[st]) - ord('a')] -= 1
                st += 1

            # Compare length with minimum length
            # and update if required. The startInd
            # check lets a window of length n (the
            # whole string) be recorded as well.
            currlen = end - st + 1
            if startInd == -1 or currlen < minlen:
                minlen = currlen
                startInd = st

        end += 1

    # Return minimum length substring, or an
    # empty string if no substring has exactly
    # k distinct characters.
    if startInd == -1:
        return ""
    return s[startInd : startInd + minlen]

# Driver code
if __name__ == "__main__":
    s = "efecfefd"
    k = 4
    print(findMinLenStr(s, k))

# This code is contributed by Ita_c


C#

// C# program to find minimum length substring
// having exactly k distinct characters.
using System;

class GFG
{

    // Function to find minimum length substring
    // having exactly k distinct characters.
    static string findMinLenStr(string str, int k)
    {
        int n = str.Length;

        // Starting index of sliding window.
        int st = 0;

        // Ending index of sliding window.
        int end = 0;

        // To store count of each character
        // (C# initializes the array to zero).
        int[] cnt = new int[26];

        // To store count of distinct
        // characters in current sliding
        // window.
        int distEle = 0;

        // To store length of current
        // sliding window.
        int currlen;

        // To store minimum length.
        int minlen = n;

        // To store starting index of minimum
        // length substring.
        int startInd = -1;

        while (end < n)
        {

            // Increment count of current character.
            // If this count is one then a new
            // distinct character is found in
            // sliding window.
            cnt[str[end] - 'a']++;
            if (cnt[str[end] - 'a'] == 1)
                distEle++;

            // If number of distinct characters is
            // greater than k, then move starting
            // point of sliding window forward,
            // until count is k.
            if (distEle > k)
            {
                while (st < end && distEle > k)
                {
                    if (cnt[str[st] - 'a'] == 1)
                        distEle--;
                    cnt[str[st] - 'a']--;
                    st++;
                }
            }

            // Remove characters from the beginning of
            // sliding window having count more than 1
            // to minimize length.
            if (distEle == k)
            {
                while (st < end && cnt[str[st] - 'a'] > 1)
                {
                    cnt[str[st] - 'a']--;
                    st++;
                }

                // Compare length with minimum length
                // and update if required. The startInd
                // check lets a window of length n (the
                // whole string) be recorded as well.
                currlen = end - st + 1;
                if (startInd == -1 || currlen < minlen)
                {
                    minlen = currlen;
                    startInd = st;
                }
            }

            end++;
        }

        // Return minimum length substring, or an
        // empty string if no substring has exactly
        // k distinct characters.
        if (startInd == -1)
            return "";
        return str.Substring(startInd, minlen);
    }

    // Driver code
    public static void Main()
    {
        string str = "efecfefd";
        int k = 4;
        Console.WriteLine(findMinLenStr(str, k));
    }
}

// This code is contributed by Ryuga

Output:

cfefd

Time Complexity: O(N), where N is the length of the given string.
Auxiliary Space: O(1)



Improved By : andrew1234, Ryuga, Ita_c