Minimum size of subset of Strings with frequency more than half of Array
Last Updated :
27 Oct, 2023
Given an Array of Strings (Arr), the task is to find the smallest subset of strings in the array such that the total count of those selected strings exceeds 50% of the size of the original array. In other words, find the minimum set of distinct strings that constitutes over 50% of the array’s elements.
Examples:
Input: Arr = [‘shoes’, ‘face’, ‘pizza’, ‘covid’, ‘shoes’, ‘covid’, ‘covid’, ‘face’, ‘shoes’]
Output: [‘covid’, ‘shoes’]
Explanation: Frequency of the strings is as follows: ‘shoes’ : 3, ‘covid’ : 3, ‘face’ : 2, ‘pizza’ : 1
So ‘shoes’ (3) + ‘covid’ (3) = 6, which is greater than half the size of the array (9 / 2 = 4.5).
Input: Arr = [‘java’, ‘python’, ‘java’, ‘python’, ‘python’]
Output: [‘python’]
Explanation: Frequency of the strings is as follows: ‘python’ : 3, ‘java’ : 2.
So ‘python’ (3) alone is greater than half the size of the array (5 / 2 = 2.5).
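The condition used in the explanations above can be checked directly. The following is a minimal sketch, not part of the original solution; the helper name exceeds_half is hypothetical and chosen only for illustration. It tests whether a candidate subset accounts for more than half of the array.

```python
from collections import Counter


def exceeds_half(arr, subset):
    """Return True if the strings in `subset` make up more than half of `arr`.

    Hypothetical helper for illustration; it does not appear in the article's code.
    """
    counts = Counter(arr)
    # set() guards against a string being listed twice in the candidate subset.
    covered = sum(counts[s] for s in set(subset))
    # Compare 2 * covered with len(arr) so odd lengths need no rounding.
    return 2 * covered > len(arr)


arr1 = ["shoes", "face", "pizza", "covid", "shoes", "covid", "covid", "face", "shoes"]
arr2 = ["java", "python", "java", "python", "python"]

print(exceeds_half(arr1, ["covid", "shoes"]))  # True  (6 of 9)
print(exceeds_half(arr1, ["covid"]))           # False (3 of 9)
print(exceeds_half(arr2, ["python"]))          # True  (3 of 5)
```

Note that ‘covid’ alone fails the check for the first input, which is why the answer there needs two strings.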
Approach #1 :
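Iterate through the array and build a dictionary of frequencies: add each newly seen string as a key with value 1, or increase its value by 1 if the string has already occurred. Then sort the dictionary entries in decreasing order of frequency and take strings from the front until their combined count exceeds half of the array's length. Taking the most frequent strings first is what keeps the subset as small as possible.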
Code:
Below is the implementation of the above approach:
Python3
def min_subset_to_exceed_half(arr):
    frequency = {}
    # Smallest total count that is strictly more than half of the array
    max_freq = (len(arr) // 2) + 1
    max_freq_strings = []

    # Count how often each string occurs
    for string in arr:
        if string in frequency:
            frequency[string] += 1
        else:
            frequency[string] = 1

    # Order the strings from most to least frequent
    sorted_frequency = dict(
        sorted(frequency.items(), key=lambda item: item[1], reverse=True))

    # Take the most frequent strings until the threshold is reached
    curr_freq = 0
    for i in sorted_frequency:
        max_freq_strings.append(i)
        curr_freq += sorted_frequency[i]
        if curr_freq >= max_freq:
            break
    return max_freq_strings


# Driver code
arr = ["shoes", "face", "pizza", "covid",
       "shoes", "covid", "covid", "face", "shoes"]
print(*min_subset_to_exceed_half(arr))
JavaScript
function minSubsetToExceedHalf(arr) {
    const frequency = new Map();
    // Smallest total count that is strictly more than half of the array
    const maxFreq = Math.floor(arr.length / 2) + 1;
    const maxFreqStrings = [];

    // Count how often each string occurs
    for (const string of arr) {
        if (frequency.has(string)) {
            frequency.set(string, frequency.get(string) + 1);
        } else {
            frequency.set(string, 1);
        }
    }

    // Order the entries from most to least frequent
    const sortedFrequency = new Map(
        [...frequency.entries()].sort((a, b) => b[1] - a[1])
    );

    // Take the most frequent strings until the threshold is reached
    let currFreq = 0;
    for (const [key, value] of sortedFrequency) {
        maxFreqStrings.push(key);
        currFreq += value;
        if (currFreq >= maxFreq) {
            break;
        }
    }
    return maxFreqStrings;
}

// Driver code
const arr = ["shoes", "face", "pizza", "covid", "shoes", "covid", "covid", "face", "shoes"];
console.log(minSubsetToExceedHalf(arr).join(' '));
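Time Complexity: O(N log N) in the worst case, where N is the length of the input array. Counting takes O(N), and sorting the K unique strings takes O(K log K) with K ≤ N.
Auxiliary Space: O(K), where K is the number of unique strings in the input array.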
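Approach #2: Using collections.Counter()
collections.Counter is the usual way to count occurrences in Python. It builds the frequency of every element in a single call, and the frequency of any single element can then be looked up by key if required. The rest of the approach is the same as in Approach #1: sort by decreasing frequency and take strings until their combined count exceeds half of the array's length.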
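As a quick illustration of what Counter provides, the short sketch below counts the first example input, looks up individual frequencies, and lists the two most frequent strings. It only uses standard-library calls (Counter indexing and most_common); the string "tacos" is an arbitrary value that does not occur in the input.

```python
from collections import Counter

arr = ["shoes", "face", "pizza", "covid", "shoes", "covid", "covid", "face", "shoes"]

# One call builds the full frequency table.
frequency = Counter(arr)

print(frequency["covid"])  # 3
print(frequency["tacos"])  # 0, because missing keys count as zero instead of raising KeyError

# most_common(n) returns the n highest counts as (string, count) pairs.
print(frequency.most_common(2))
```

Because most_common() already returns the entries in decreasing order of count, it could stand in for the explicit sorted(...) call if preferred.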
Code:
Below is the implementation of the above approach:
Python3
from collections import Counter


def min_subset_to_exceed_half(arr):
    # Counter builds the frequency table in one call
    frequency = Counter(arr)
    # Smallest total count that is strictly more than half of the array
    max_freq = (len(arr) // 2) + 1
    max_freq_strings = []

    # Order the strings from most to least frequent
    sorted_frequency = dict(
        sorted(frequency.items(), key=lambda item: item[1], reverse=True))

    # Take the most frequent strings until the threshold is reached
    curr_freq = 0
    for i in sorted_frequency:
        max_freq_strings.append(i)
        curr_freq += sorted_frequency[i]
        if curr_freq >= max_freq:
            break
    return max_freq_strings


# Driver code
arr = ["shoes", "face", "pizza", "covid",
       "shoes", "covid", "covid", "face", "shoes"]
print(*min_subset_to_exceed_half(arr))
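Time Complexity: O(N log N) in the worst case, where N is the length of the input array. Counting takes O(N), and sorting the K unique strings takes O(K log K) with K ≤ N.
Auxiliary Space: O(K), where K is the number of unique strings in the input array.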
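One way to gain confidence that taking strings in decreasing order of frequency really yields a subset of minimum size is to compare it against an exhaustive search on small inputs. The sketch below is an illustrative cross-check rather than part of the original solution; the function names greedy_size and brute_force_size are hypothetical, and the exhaustive search is exponential in the number of unique strings, so it is only suitable for tiny arrays.

```python
from collections import Counter
from itertools import combinations


def greedy_size(arr):
    """Size of the subset obtained by taking strings in decreasing frequency."""
    covered = 0
    for size, (_, count) in enumerate(Counter(arr).most_common(), start=1):
        covered += count
        if 2 * covered > len(arr):
            return size
    return 0  # only reached for an empty array


def brute_force_size(arr):
    """Smallest qualifying subset size, found by trying every combination."""
    counts = Counter(arr)
    for size in range(1, len(counts) + 1):
        for subset in combinations(counts, size):
            if 2 * sum(counts[s] for s in subset) > len(arr):
                return size
    return 0  # only reached for an empty array


samples = [
    ["shoes", "face", "pizza", "covid", "shoes", "covid", "covid", "face", "shoes"],
    ["java", "python", "java", "python", "python"],
    ["a", "b", "c", "d"],
]
for sample in samples:
    # Both strategies should agree on the minimum subset size.
    assert greedy_size(sample) == brute_force_size(sample)
    print(len(sample), greedy_size(sample))
```

For the three sample arrays this prints the array length followed by the minimum subset size: 9 2, 5 1, and 4 3.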