Longest substring with K unique characters using Binary Search

Given a string str and an integer K, the task is to print the length of the longest substring that has exactly K unique characters. If there is no such substring, print -1.

Examples:

Input: str = “aabacbebebe”, K = 3
Output: 7
“cbebebe” is the required substring.

Input: str = “aabc”, K = 4
Output: -1
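
As a sanity check on the examples above, here is a minimal brute-force sketch in Python. The function name longest_exact_k_bruteforce is hypothetical and is not part of the implementation given in this article; it simply tries every substring, so it takes O(n^2) time and is only meant for verifying small inputs.

```python
def longest_exact_k_bruteforce(s, k):
    """Length of the longest substring of s with exactly k distinct
    characters, or -1 if there is none. (Hypothetical helper.)"""
    best = -1
    n = len(s)
    for start in range(n):
        seen = set()
        for end in range(start, n):
            seen.add(s[end])
            if len(seen) > k:
                # Extending further can only add more distinct characters
                break
            if len(seen) == k:
                best = max(best, end - start + 1)
    return best


print(longest_exact_k_bruteforce("aabacbebebe", 3))  # 7
print(longest_exact_k_bruteforce("aabc", 4))         # -1
```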

Approach: Another approach to this problem has been discussed in a separate article. This article discusses a binary search based approach. Binary search is applied on the length of the substring. For a candidate length len, check whether some substring of length len has at most K unique characters. If one exists, try larger lengths, up to the size of the input string; otherwise, try smaller lengths.
Before searching, a set is used to collect all the characters of the string: if the size of the set is less than K, no substring can have K unique characters and the answer is -1. Otherwise, the largest length found by the binary search is the answer, because a longest window with at most K unique characters must then contain exactly K of them (a window with fewer could be extended by one character without exceeding K).
Binary search is applicable because the check is monotonic: if some substring of length len has at most K unique characters, then the same holds for every shorter length, since removing a character from a window never adds a unique character. The feasible lengths therefore form a prefix of 0..n, and the search finds the largest one.
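
The monotonic behaviour of the check can be seen on the first example. The sketch below is illustrative only: has_window_at_most_k is a hypothetical name, and it uses collections.Counter instead of the plain map used in the implementations that follow. It evaluates the check for every length from 0 to n.

```python
from collections import Counter


def has_window_at_most_k(s, length, k):
    """True if some substring of s with the given length has at most
    k distinct characters. (Hypothetical helper for illustration.)"""
    if length == 0:
        return True   # the empty window trivially qualifies
    if length > len(s):
        return False
    counts = Counter(s[:length])
    if len(counts) <= k:
        return True
    for right in range(length, len(s)):
        # Slide the window: add the new character, drop the oldest one
        counts[s[right]] += 1
        left_char = s[right - length]
        counts[left_char] -= 1
        if counts[left_char] == 0:
            del counts[left_char]
        if len(counts) <= k:
            return True
    return False


s, k = "aabacbebebe", 3
feasible = [has_window_at_most_k(s, length, k) for length in range(len(s) + 1)]
print(feasible)
# Lengths 0..7 are feasible and lengths 8..11 are not
```

Because the list is a run of True values followed by a run of False values, the last True index (7 here) can be located with binary search instead of a linear scan.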



Below is the implementation of the above approach:

C++

// C++ implementation of the approach
#include <bits/stdc++.h>
using namespace std;
  
// Function that returns true if there
// is a substring of length len
// with <=k unique characters
bool isValidLen(string s, int len, int k)
{
  
    // Size of the string
    int n = s.size();
  
    // Map to store the characters
    // and their frequency
    unordered_map<char, int> mp;
    int right = 0;
  
    // Update the map for the
    // first substring
    while (right < len) {
        mp[s[right]]++;
        right++;
    }
  
    if (mp.size() <= k)
        return true;
  
    // Check for the rest of the substrings
    while (right < n) {
  
        // Add the new character
        mp[s[right]]++;
  
        // Remove the first character
        // of the previous window
        mp[s[right - len]]--;
  
        // Update the map
        if (mp[s[right - len]] == 0)
            mp.erase(s[right - len]);
        if (mp.size() <= k)
            return true;
        right++;
    }
    return mp.size() <= k;
}
  
// Function to return the length of the
// longest substring which has K
// unique characters
int maxLenSubStr(string s, int k)
{
  
    // Check if the complete string
    // contains K unique characters
    set<char> uni;
    for (auto x : s)
        uni.insert(x);
    if (uni.size() < k)
        return -1;
  
    // Size of the string
    int n = s.size();
  
    // Apply binary search
    int lo = -1, hi = n + 1;
    while (hi - lo > 1) {
        int mid = (lo + hi) >> 1;
        if (isValidLen(s, mid, k))
            lo = mid;
        else
            hi = mid;
    }
    return lo;
}
  
// Driver code
int main()
{
    string s = "aabacbebebe";
    int k = 3;
  
    cout << maxLenSubStr(s, k);
  
    return 0;
}


Java

// Java implementation of the approach
import java.util.*;
  
class GFG 
{
  
    // Function that returns true if there
    // is a subString of length len
    // with <=k unique characters
    static boolean isValidLen(String s, 
                              int len, int k) 
    {
  
        // Size of the String
        int n = s.length();
  
        // Map to store the characters
        // and their frequency
        Map<Character, 
            Integer> mp = new HashMap<Character,
                                      Integer>();
        int right = 0;
  
        // Update the map for the
        // first subString
        while (right < len)
        {
            if (mp.containsKey(s.charAt(right))) 
            {
                mp.put(s.charAt(right), 
                mp.get(s.charAt(right)) + 1);
            }
            else
            {
                mp.put(s.charAt(right), 1);
            }
            right++;
        }
  
        if (mp.size() <= k)
            return true;
  
        // Check for the rest of the subStrings
        while (right < n) 
        {
  
            // Add the new character
            if (mp.containsKey(s.charAt(right))) 
            {
                mp.put(s.charAt(right), 
                mp.get(s.charAt(right)) + 1);
            }
            else 
            {
                mp.put(s.charAt(right), 1);
            }
  
            // Remove the first character
            // of the previous window
            if (mp.containsKey(s.charAt(right - len)))
            {
                mp.put(s.charAt(right - len),
                mp.get(s.charAt(right - len)) - 1);
            }
  
            // Update the map
            if (mp.get(s.charAt(right - len)) == 0)
                mp.remove(s.charAt(right - len));
            if (mp.size() <= k)
                return true;
            right++;
        }
        return mp.size() <= k;
    }
  
    // Function to return the length of the
    // longest subString which has K
    // unique characters
    static int maxLenSubStr(String s, int k) 
    {
  
        // Check if the complete String
        // contains K unique characters
        Set<Character> uni = new HashSet<Character>();
        for (Character x : s.toCharArray())
            uni.add(x);
        if (uni.size() < k)
            return -1;
  
        // Size of the String
        int n = s.length();
  
        // Apply binary search
        int lo = -1, hi = n + 1;
        while (hi - lo > 1)
        {
            int mid = lo + hi >> 1;
            if (isValidLen(s, mid, k))
                lo = mid;
            else
                hi = mid;
        }
        return lo;
    }
  
    // Driver code
    public static void main(String[] args) 
    {
        String s = "aabacbebebe";
        int k = 3;
  
        System.out.print(maxLenSubStr(s, k));
    }
}
  
// This code is contributed by Rajput-Ji


Python3

# Python3 implementation of the approach
  
# Function that returns True if there
# is a substring of length lenn
# with <=k unique characters
def isValidLen(s, lenn, k):
  
    # Size of the string
    n = len(s)
  
    # Map to store the characters
    # and their frequency
    mp = dict()
    right = 0
  
    # Update the map for the
    # first substring
    while (right < lenn):
        mp[s[right]] = mp.get(s[right], 0) + 1
        right += 1
  
    if (len(mp) <= k):
        return True
  
    # Check for the rest of the substrings
    while (right < n):
  
        # Add the new character
        mp[s[right]] = mp.get(s[right], 0) + 1
  
        # Remove the first character
        # of the previous window
        mp[s[right - lenn]] -= 1
  
        # Update the map
        if (mp[s[right - lenn]] == 0):
            del mp[s[right - lenn]]
        if (len(mp) <= k):
            return True
        right += 1
  
    return len(mp)<= k
  
# Function to return the length of the
# longest substring which has K
# unique characters
def maxLenSubStr(s, k):
  
    # Check if the complete string
    # contains K unique characters
    uni = dict()
    for x in s:
        uni[x] = 1
    if (len(uni) < k):
        return -1
  
    # Size of the string
    n = len(s)
  
    # Apply binary search
    lo = -1
    hi = n + 1
    while (hi - lo > 1):
        mid = lo + hi >> 1
        if (isValidLen(s, mid, k)):
            lo = mid
        else:
            hi = mid
  
    return lo
  
# Driver code
s = "aabacbebebe"
k = 3
  
print(maxLenSubStr(s, k))
  
# This code is contributed by Mohit Kumar


C#

// C# implementation of the approach
using System;
using System.Collections.Generic;
  
class GFG 
{
  
    // Function that returns true if there
    // is a subString of length len
    // with <=k unique characters
    static bool isValidLen(String s, 
                           int len, int k) 
    {
  
        // Size of the String
        int n = s.Length;
  
        // Map to store the characters
        // and their frequency
        Dictionary<char, int> mp = new Dictionary<char, int>();
        int right = 0;
  
        // Update the map for the
        // first subString
        while (right < len)
        {
            if (mp.ContainsKey(s[right])) 
            {
                mp[s[right]] = mp[s[right]] + 1;
            }
            else
            {
                mp.Add(s[right], 1);
            }
            right++;
        }
  
        if (mp.Count <= k)
            return true;
  
        // Check for the rest of the subStrings
        while (right < n) 
        {
  
            // Add the new character
            if (mp.ContainsKey(s[right])) 
            {
                mp[s[right]] = mp[s[right]] + 1;
            }
            else
            {
                mp.Add(s[right], 1);
            }
  
            // Remove the first character
            // of the previous window
            if (mp.ContainsKey(s[right - len]))
            {
                mp[s[right - len]] = mp[s[right - len]] - 1;
            }
  
            // Update the map
            if (mp[s[right - len]] == 0)
                mp.Remove(s[right - len]);
            if (mp.Count <= k)
                return true;
            right++;
        }
        return mp.Count <= k;
    }
  
    // Function to return the length of the
    // longest subString which has K
    // unique characters
    static int maxLenSubStr(String s, int k) 
    {
  
        // Check if the complete String
        // contains K unique characters
        HashSet<char> uni = new HashSet<char>();
        foreach (char x in s.ToCharArray())
            uni.Add(x);
        if (uni.Count < k)
            return -1;
  
        // Size of the String
        int n = s.Length;
  
        // Apply binary search
        int lo = -1, hi = n + 1;
        while (hi - lo > 1) 
        {
            int mid = lo + hi >> 1;
            if (isValidLen(s, mid, k))
                lo = mid;
            else
                hi = mid;
        }
        return lo;
    }
  
    // Driver code
    public static void Main(String[] args) 
    {
        String s = "aabacbebebe";
        int k = 3;
  
        Console.Write(maxLenSubStr(s, k));
    }
}
  
// This code is contributed by Rajput-Ji


Output:

7

Improved By : mohit kumar 29, Rajput-Ji