Open In App

Minimum sum of squares of character counts in a given string after removing k characters

Given a string of lowercase alphabets and a number k, the task is to print the minimum value of the string after removal of ‘k’ characters. The value of a string is defined as the sum of squares of the count of each distinct character. 
For example consider the string “saideep”, here frequencies of characters are s-1, a-1, i-1, e-2, d-1, p-1 and value of the string is 1^2 + 1^2 + 1^2 + 1^2 + 1^2 + 2^2 = 9.
Expected Time Complexity: O(k*logn)

Examples: 

Input :  str = abccc, K = 1
Output :  6
Explanation: We remove c to get the value as 12 + 12 + 22

Input :  str = aaab, K = 2
Output :  2

Asked In: Amazon

Recommended Practice

One clear observation is that we need to remove character with highest frequency. One trick is the character ma
A Simple solution is to use sorting technique through all current highest frequency reduce up to k times. For After every reduce again sort frequency array. 

A Better Solution used to Priority Queue which has to the highest element on top. 

  1. Initialize empty priority queue.
  2. Count frequency of each character and Store into temp array.
  3. Remove K characters which have highest frequency from queue.
  4. Finally Count Sum of square of each element and return it.

Below is the implementation of the above idea. 




// C++ program to find min sum of squares
// of characters after k removals
#include <bits/stdc++.h>
using namespace std;
 
const int MAX_CHAR = 26;
 
// Main Function to calculate min sum of
// squares of characters after k removals
int minStringValue(string str, int k)
{
    int l = str.length(); // find length of string
 
    // if K is greater than length of string
    // so reduced string will become 0
    if (k >= l)
        return 0;
 
    // Else find Frequency of each character and
    // store in an array
    int frequency[MAX_CHAR] = { 0 };
    for (int i = 0; i < l; i++)
        frequency[str[i] - 'a']++;
 
    // Push each char frequency into a priority_queue
    priority_queue<int> q;
    for (int i = 0; i < MAX_CHAR; i++)
        q.push(frequency[i]);
 
    // Removal of K characters
    while (k--) {
        // Get top element in priority_queue,
        // remove it. Decrement by 1 and again
        // push into priority_queue
        int temp = q.top();
        q.pop();
        temp = temp - 1;
        q.push(temp);
    }
 
    // After removal of K characters find sum
    // of squares of string Value
    int result = 0; // Initialize result
    while (!q.empty()) {
        int temp = q.top();
        result += temp * temp;
        q.pop();
    }
 
    return result;
}
 
// Driver Code
int main()
{
    string str = "abbccc"; // Input 1
    int k = 2;
    cout << minStringValue(str, k) << endl;
 
    str = "aaab"; // Input 2
    k = 2;
    cout << minStringValue(str, k);
 
    return 0;
}




// Java program to find min sum of squares
// of characters after k removals
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Collections;
public class GFG {
 
    static final int MAX_CHAR = 26;
 
    // Main Function to calculate min sum of
    // squares of characters after k removals
    static int minStringValue(String str, int k)
    {
        int l = str.length(); // find length of string
 
        // if K is greater than length of string
        // so reduced string will become 0
        if (k >= l)
            return 0;
 
        // Else find Frequency of each character and
        // store in an array
        int[] frequency = new int[MAX_CHAR];
        for (int i = 0; i < l; i++)
            frequency[str.charAt(i) - 'a']++;
 
        // creating a priority queue with comparator
        // such that elements in the queue are in
        // descending order.
        PriorityQueue<Integer> q = new PriorityQueue<>(Collections.reverseOrder());
 
        // Push each char frequency into a priority_queue
        for (int i = 0; i < MAX_CHAR; i++) {
            if (frequency[i] != 0)
                q.add(frequency[i]);
        }
 
        // Removal of K characters
        while (k != 0) {
            // Get top element in priority_queue,
            // remove it. Decrement by 1 and again
            // push into priority_queue
            q.add(q.poll() - 1);
            k--;
        }
 
        // After removal of K characters find sum
        // of squares of string Value
        int result = 0; // Initialize result
        while (!q.isEmpty()) {
            result += q.peek() * q.poll();
        }
 
        return result;
    }
 
    // Driver Code
    public static void main(String args[])
    {
        String str = "abbccc"; // Input 1
        int k = 2;
        System.out.println(minStringValue(str, k));
 
        str = "aaab"; // Input 2
        k = 2;
        System.out.println(minStringValue(str, k));
    }
}
// This code is contributed by Sumit Ghosh




# Python3 program to find min sum of
# squares of characters after k removals
from queue import PriorityQueue
 
MAX_CHAR = 26
 
# Main Function to calculate min sum of
# squares of characters after k removals
def minStringValue(str, k):
    l = len(str) # find length of string
 
    # if K is greater than length of string
    # so reduced string will become 0
    if(k >= l):
        return 0
     
    # Else find Frequency of each
    # character and store in an array
    frequency = [0] * MAX_CHAR
    for i in range(0, l):
        frequency[ord(str[i]) - 97] += 1
 
    # Push each char frequency negative
    # into a priority_queue as the queue
    # by default is minheap
    q = PriorityQueue()
    for i in range(0, MAX_CHAR):
        q.put(-frequency[i])
 
    # Removal of K characters
    while(k > 0):
         
        # Get top element in priority_queue
        # multiply it by -1 as temp is negative
        # remove it. Increment by 1 and again
        # push into priority_queue
        temp = q.get()
        temp = temp + 1
        q.put(temp, temp)
        k = k - 1
 
    # After removal of K characters find
    # sum of squares of string Value    
    result = 0; # initialize result
    while not q.empty():
        temp = q.get()
        temp = temp * (-1)
        result += temp * temp
    return result
 
# Driver Code
if __name__ == "__main__":
    str = "abbccc"
    k = 2
    print(minStringValue(str, k))
 
    str = "aaab"
    k = 2
    print(minStringValue(str, k))
     
# This code is contributed
# by Sairahul Jella




// C# program to find min sum of squares
// of characters after k removals
using System;
using System.Collections.Generic;
class GFG {
 
  static readonly int MAX_CHAR = 26;
 
  // Main Function to calculate min sum of
  // squares of characters after k removals
  static int minStringValue(String str, int k)
  {
    int l = str.Length; // find length of string
 
    // if K is greater than length of string
    // so reduced string will become 0
    if (k >= l)
      return 0;
 
    // Else find Frequency of each character and
    // store in an array
    int[] frequency = new int[MAX_CHAR];
    for (int i = 0; i < l; i++)
      frequency[str[i] - 'a']++;
 
    // creating a priority queue with comparator
    // such that elements in the queue are in
    // descending order.
    List<int> q = new List<int>();
 
    // Push each char frequency into a priority_queue
    for (int i = 0; i < MAX_CHAR; i++)
    {
      if (frequency[i] != 0)
        q.Add(frequency[i]);
    }
 
    // Removal of K characters
    while (k != 0)
    {
      q.Sort();
      q.Reverse();
 
      // Get top element in priority_queue,
      // remove it. Decrement by 1 and again
      // push into priority_queue
      q.Add(q[0] - 1);
      q.RemoveAt(0);
      k--;
    }
 
    // After removal of K characters find sum
    // of squares of string Value
    int result = 0; // Initialize result
    while (q.Count != 0)
    {
      result += q[0] * q[0];
      q.RemoveAt(0);
    }
    return result;
  }
 
  // Driver Code
  public static void Main(String []args)
  {
    String str = "abbccc"; // Input 1
    int k = 2;
    Console.WriteLine(minStringValue(str, k));
    str = "aaab"; // Input 2
    k = 2;
    Console.WriteLine(minStringValue(str, k));
  }
}
 
// This code is contributed by gauravrajput1




<script>
 
// JavaScript program to find min sum of squares
// of characters after k removals
 
let MAX_CHAR = 26;
 
// Main Function to calculate min sum of
    // squares of characters after k removals
function minStringValue(str,k)
{
    let l = str.length; // find length of string
  
        // if K is greater than length of string
        // so reduced string will become 0
        if (k >= l)
            return 0;
  
        // Else find Frequency of each character and
        // store in an array
        let frequency = new Array(MAX_CHAR);
        for(let i=0;i<MAX_CHAR;i++)
            frequency[i]=0;
        for (let i = 0; i < l; i++)
            frequency[str[i].charCodeAt(0) - 'a'.charCodeAt(0)]++;
  
        // creating a priority queue with comparator
        // such that elements in the queue are in
        // descending order.
        let q = [];
  
        // Push each char frequency into a priority_queue
        for (let i = 0; i < MAX_CHAR; i++) {
            if (frequency[i] != 0)
                q.push(frequency[i]);
        }
         
        q.sort(function(a,b){return b-a;});
  
        // Removal of K characters
        while (k != 0) {
            // Get top element in priority_queue,
            // remove it. Decrement by 1 and again
            // push into priority_queue
            q.push(q.shift() - 1);
            q.sort(function(a,b){return b-a;});
            k--;
        }
  
        // After removal of K characters find sum
        // of squares of string Value
        let result = 0; // Initialize result
        while (q.length!=0) {
            result += q[0] * q.shift();
        }
  
        return result;
}
 
// Driver Code
let str = "abbccc"; // Input 1
let k = 2;
document.write(minStringValue(str, k)+"<br>");
 
str = "aaab"; // Input 2
k = 2;
document.write(minStringValue(str, k)+"<br>");
 
// This code is contributed by unknown2108
 
</script>

Output
6
2

Time Complexity: O(k*logn)
Auxiliary Space: O(N) because constant size array but priority_queue is storing characters almost the length of the string in worst case.

Efficient Approach :

We can solve it in O(N) time complexity as we need to be greedy and always remove the characters of alphabets which are higher in frequency.

Example: Let str=”abbccc” and k=2 now, alphabetCount[1]=1;//for ‘a’ alphabetCount[2]=2;//for ‘b’ alphabetCount[3]=3;//for ‘c’ maximum=3 m[1]=1(only a occur 1 times) m[2]=1(only b occur 2 times) m[3]=1(only c occur 3 times) //k=2 maximum=3 so k=k-m[maximum]//k=k-1; so now one c got removes so frequencies are a=1,b=2,c=2; so as c’s frequency got decreased by one m[maximum] will be zero and m[maximum-1] will be increased by m[maximum] so update m[2]+=m[3], m[3]=0; also maximum gets decreased by one as it is guaranteed to exist frequency one less than maximum from above. m[1]=1 , m[2]=2 , m[3]=0 and k=1; now m[maximum](i.e m[2]=2>k) so we should partially remove remove one character of either b or c so m[1]=2 0,m[2]=1 ,m[3]=0 and k=0; so, (1*1)*2 + (2*2)*1 + (3*3)*0 = 6//ans

Implementation:




// C++ program to find min sum of squares
// of characters after k removals
#include <bits/stdc++.h>
using namespace std;
 
const int MAX_CHAR = 26;
 
// Main Function to calculate min sum of
// squares of characters after k removals
int minStringValue(string str, int k)
{
    int alphabetCount[MAX_CHAR]= {0};
 
    // Here the array stored frequency the number of
    // occurrences in string m[frequency]=number of alphabets
    // with frequency i.e, in our example abbccc m[1]=1(1
    // a's occur),m[2]=1(2 b's occur),m[3]=1(3 c'soccur)
    int m[str.length()] = { 0 };
   
    for (int i = 0; i < str.length(); i++) {
        alphabetCount[str[i] - 'a']++;
    }
    // Store the maximum
    int maximum = 0;
   
    for (int i = 0; i < MAX_CHAR; i++) {
        m[alphabetCount[i]]++;
        maximum = max(maximum, alphabetCount[i]);
    }
 
    while (k > 0) {
        int z = m[maximum];
        if (z <= k) {
            // Remove one occurrence of alphabet from each
            // with frequency as maximum.
            // So we will have k-z more remove operations to
            // perform as z is number of characters and we
            // perform one removal from each of the alphabet
            // with that frequency.
            k = k - z;
            // As we removed one occurrence from each the
            // alphabets will no longer have the frequency
            // of maximum their frequency will be decreased
            // by one so add these number of alphabets to
            // group with frequency one less than maximum.
            // Remove them from maximum count.
            m[maximum] = 0;
            // Add those to frequency one less.
            m[maximum - 1] += z;
            // new maximum will be one less.
            maximum--;
        }
        else {
            // if all the elements of that frequency cannot
            // be removed we should partially remove them.
            m[maximum] -= k;
            maximum--;
            m[maximum] += k;
            k = 0;
        }
    }
 
    int ans = 0;
    for (int i = 0; i < str.length(); i++) {
        //(square of frequency)*(number of
        // characters corresponding to that frequency)
        ans += (i * i) * m[i];
    }
 
    return ans;
}
 
// Driver Code
int main()
{
    string str = "abbccc"; // Input 1
    int k = 2;
    cout << minStringValue(str, k) << endl;
 
    str = "aaab"; // Input 2
    k = 2;
    cout << minStringValue(str, k);
 
    return 0;
}
 
// This code is contributed by Kasina Dheeraj.




// Java program to find min sum of squares
// of characters after k removals
import java.util.Collections;
import java.util.Comparator;
import java.util.PriorityQueue;
public class GFG {
 
    static final int MAX_CHAR = 26;
 
    // Main Function to calculate min sum of
    // squares of characters after k removals
    static int minStringValue(String str, int k)
    {
        int[] alphabetCount = new int[MAX_CHAR];
        // Here the array stored frequency the number of
        // occurrences in string m[frequency]=number of
        // alphabets with frequency i.e, in our example
        // abbccc m[1]=1(1 a's occur),m[2]=1(2 b's
        // occur),m[3]=1(3 c'soccur)
        int[] m = new int[str.length()];
       
        for (int i = 0; i < str.length(); i++) {
            alphabetCount[str.charAt(i) - 'a']++;
        }
        // Store the maximum
        int maximum = 0;
       
        for (int i = 0; i < MAX_CHAR; i++) {
            m[alphabetCount[i]]++;
            maximum = Math.max(maximum, alphabetCount[i]);
        }
       
        while (k > 0) {
            int z = m[maximum];
            if (z <= k) {
                // Remove one occurrence of alphabet from
                // each with frequency as maximum. So we
                // will have k-z more remove operations to
                // perform as z is number of characters and
                // we perform one removal from each of the
                // alphabet with that frequency.
                k = k - z;
                // As we removed one occurrence from each the
                // alphabets will no longer have the
                // frequency of maximum their frequency will
                // be decreased by one so add these number
                // of alphabets to group with frequency one
                // less than maximum. Remove them from
                // maximum count.
                m[maximum] = 0;
                // Add those to frequency one less.
                m[maximum - 1] += z;
                // new maximum will be one less.
                maximum--;
            }
            else {
                // if all the elements of that frequency
                // cannot be removed we should partially
                // remove them.
                m[maximum] -= k;
                maximum--;
                m[maximum] += k;
                k = 0;
            }
        }
 
        int ans = 0;
        for (int i = 0; i < str.length(); i++) {
            //(square of frequency)*(number of
            // characters corresponding to that frequency)
            ans += (i * i) * m[i];
        }
 
        return ans;
    }
 
    // Driver Code
    public static void main(String args[])
    {
        String str = "abbccc"; // Input 1
        int k = 2;
        System.out.println(minStringValue(str, k));
 
        str = "aaab"; // Input 2
        k = 2;
        System.out.println(minStringValue(str, k));
    }
}
// This code is contributed by Kasina Dheeraj.




# Python program to find min sum of squares
# of characters after k removals
MAX_CHAR = 26
 
# Main Function to calculate min sum of
# squares of characters after k removals
 
def minStringValue(str, k):
   
    alphabetCount =[]
    for i in range(MAX_CHAR):
      alphabetCount.append(0)
 
    # Here the array stored frequency the number of
    # occurrences in string m[frequency]=number of alphabets
    # with frequency i.e, in our example abbccc m[1]=1(1
    # a's occur),m[2]=1(2 b's occur),m[3]=1(3 c'soccur)
    m = []
    for i in range(len(str)):
        m.append(0)
 
    for i in range(len(str)):
        alphabetCount[ord(str[i]) - ord('a')] += 1
 
    # Store the maximum
    maximum = 0
 
    for i in range(MAX_CHAR):
        m[alphabetCount[i]] += 1
        maximum = max(maximum, alphabetCount[i])
 
    while (k > 0):
        z = m[maximum]
        if z <= k:
            # Remove one occurrence of alphabet from each
            # with frequency as maximum.
            # So we will have k-z more remove operations to
            # perform as z is number of characters and we
            # perform one removal from each of the alphabet
            # with that frequency.
            k = k - z
            # As we removed one occurrence from each the
            # alphabets will no longer have the frequency
            # of maximum their frequency will be decreased
            # by one so add these number of alphabets to
            # group with frequency one less than maximum.
            # Remove them from maximum count.
            m[maximum] = 0
            # Add those to frequency one less.
            m[maximum - 1] += z
            # new maximum will be one less.
            maximum -= 1
 
        else:
            # if all the elements of that frequency cannot
            # be removed we should partially remove them.
            m[maximum] -= k
            maximum -= 1
            m[maximum] += k
            k = 0
 
    ans = 0
    for i in range(len(str)):
        # (square of frequency)*(number of
        # characters corresponding to that frequency)
        ans = ans + (i * i) * m[i]
 
    return ans
 
# Driver Code
str = "abbccc"  # Input 1
k = 2
 
print(minStringValue(str, k))
 
str = "aaab"  # Input 2
k = 2
print(minStringValue(str, k))
 
 
# This code is contributed by Abhijeet Kumar(abhijeet19403)




// C# program to find min sum of squares
// of characters after k removals
using System;
 
public static class GFG {
  static int MAX_CHAR = 26;
 
  // Main Function to calculate min sum of
  // squares of characters after k removals
  static int minStringValue(string str, int k)
  {
    int[] alphabetCount = new int[MAX_CHAR];
 
    // Here the array stored frequency the number of
    // occurrences in string m[frequency]=number of
    // alphabets with frequency i.e, in our example
    // abbccc m[1]=1(1 a's occur),m[2]=1(2 b's
    // occur),m[3]=1(3 c'soccur)
    int[] m = new int[str.Length];
 
    for (int i = 0; i < str.Length; i++) {
      alphabetCount[str[i] - 'a']++;
    }
    // Store the maximum
    int maximum = 0;
 
    for (int i = 0; i < MAX_CHAR; i++) {
      m[alphabetCount[i]]++;
      maximum = Math.Max(maximum, alphabetCount[i]);
    }
 
    while (k > 0) {
      int z = m[maximum];
      if (z <= k) {
        // Remove one occurrence of alphabet from
        // each with frequency as maximum. So we
        // will have k-z more remove operations to
        // perform as z is number of characters and
        // we perform one removal from each of the
        // alphabet with that frequency.
        k = k - z;
        // As we removed one occurrence from each
        // the alphabets will no longer have the
        // frequency of maximum their frequency will
        // be decreased by one so add these number
        // of alphabets to group with frequency one
        // less than maximum. Remove them from
        // maximum count.
        m[maximum] = 0;
        // Add those to frequency one less.
        m[maximum - 1] += z;
        // new maximum will be one less.
        maximum--;
      }
      else {
        // if all the elements of that frequency
        // cannot be removed we should partially
        // remove them.
        m[maximum] -= k;
        maximum--;
        m[maximum] += k;
        k = 0;
      }
    }
 
    int ans = 0;
    for (int i = 0; i < str.Length; i++) {
      //(square of frequency)*(number of
      // characters corresponding to that frequency)
      ans += (i * i) * m[i];
    }
 
    return ans;
  }
 
  // Driver Code
  public static void Main()
  {
    string str = "abbccc"; // Input 1
    int k = 2;
    Console.Write(minStringValue(str, k));
    Console.Write("\n");
 
    str = "aaab"; // Input 2
    k = 2;
    Console.Write(minStringValue(str, k));
  }
}
 
  // This code is contributed by Aarti_Rathi




<script>
   
// JavaScript program to find min sum of squares
// of characters after k removals
   
let MAX_CHAR = 26;
   
// Main Function to calculate min sum of
// squares of characters after k removals
function minStringValue(str,k)
{
    var alphabetCount = new Array(MAX_CHAR).fill(0);
     
    // Here the array stored frequency the number of
    // occurrences in string m[frequency]=number of alphabets
    // with frequency i.e, in our example abbccc m[1]=1(1
    // a's occur),m[2]=1(2 b's occur),m[3]=1(3 c'soccur)
    var m = new Array(str.length).fill(0);
    var i;
    for (i = 0; i < str.length; i++) {
        alphabetCount[str.charCodeAt(i) - 97]++;
    }
    // Store the maximum
    var maximum = 0;
     
    for (i = 0; i < MAX_CHAR; i++) {
        m[alphabetCount[i]]++;
        maximum = Math.max(maximum, alphabetCount[i]);
    }      
             
    while (k > 0) {
        var z = m[maximum];
        if (z <= k) {
            // Remove one occurrence of alphabet from each
            // with frequency as maximum.
            // So we will have k-z more remove operations to
            // perform as z is number of characters and we
            // perform one removal from each of the alphabet
            // with that frequency.
            k = k - z;
            // As we removed one occurrence from each the
            // alphabets will no longer have the frequency
            // of maximum their frequency will be decreased
            // by one so add these number of alphabets to
            // group with frequency one less than maximum.
            // Remove them from maximum count.
            m[maximum] = 0;
            // Add those to frequency one less.
            m[maximum - 1] += z;
            // new maximum will be one less.
            maximum--;
        }
        else {
            // if all the elements of that frequency cannot
            // be removed we should partially remove them.
            m[maximum] -= k;
            maximum--;
            m[maximum] += k;
            k = 0;
        }
    }
   
    var ans = 0;
    for (i = 0; i < str.length; i++) {
        //(square of frequency)*(number of
        // characters corresponding to that frequency)
        ans += (i * i) * m[i];
    }
   
    return ans;
}
   
// Driver Code
let str = "abbccc"; // Input 1
let k = 2;
document.write(minStringValue(str, k)+"<br>");
   
str = "aaab"; // Input 2
k = 2;
document.write(minStringValue(str, k)+"<br>");
   
// This code is contributed by aditya942003patil
   
</script>

Output
6
2

Time Complexity: O(N)
Space Complexity: O(N)

 


Article Tags :