# Find top k (or most frequent) numbers in a stream

Given an array of n numbers, read the numbers one at a time and, after each number is read, keep at most k numbers at the top, ordered by decreasing frequency. In other words, once the input stream has included at least k distinct elements, print the top k numbers sorted by frequency; before that, print all distinct elements seen so far sorted by frequency. Elements with the same frequency are printed in increasing order of value.

Examples:

Input : arr[] = {5, 2, 1, 3, 2}
k = 4
Output : 5 2 5 1 2 5 1 2 3 5 2 1 3 5
Explanation:

1. After reading 5, there is only one element, 5, so its frequency is the maximum so far. Print 5.
2. After reading 2, there are two elements, 2 and 5, with the same frequency. Since 2 is smaller than 5, print 2 5.
3. After reading 1, there are three elements, 1, 2 and 5, with the same frequency, so print 1 2 5.
4. Similarly, after reading 3, print 1 2 3 5.
5. After reading the last element 2, which has already occurred, the frequency of 2 becomes 2. So 2 is kept at the top, followed by the rest of the elements, which share the same frequency, in sorted order. Print 2 1 3 5.

Input : arr[] = {5, 2, 1, 3, 4}
k = 4
Output : 5 2 5 1 2 5 1 2 3 5 1 2 3 4
Explanation:

1. After reading 5, there is only one element, 5, so its frequency is the maximum so far. Print 5.
2. After reading 2, there are two elements, 2 and 5, with the same frequency. Since 2 is smaller than 5, print 2 5.
3. After reading 1, there are three elements, 1, 2 and 5, with the same frequency, so print 1 2 5.
4. Similarly, after reading 3, print 1 2 3 5.
5. After reading the last element 4, all five elements have the same frequency, so the four smallest are kept. Print 1 2 3 4.
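The expected outputs above can be reproduced with a brute-force reference that re-sorts all distinct elements after every read. This is only a minimal sketch for checking the examples, not the approach the article describes; the function name `top_k_reference` is made up for illustration.

```python
from collections import Counter


def top_k_reference(stream, k):
    """For each prefix of `stream`, return the top-k distinct values,
    ordered by decreasing frequency, ties broken by smaller value first."""
    freq = Counter()
    snapshots = []
    for x in stream:
        freq[x] += 1
        # Sort every distinct value seen so far: high frequency first,
        # then small value first.
        ranked = sorted(freq, key=lambda v: (-freq[v], v))
        snapshots.append(ranked[:k])
    return snapshots


for row in top_k_reference([5, 2, 1, 3, 2], 4):
    print(*row)
```

Each printed row corresponds to one step of the first example's explanation. Re-sorting on every read is simple but does more work than necessary, which is what the approach in this article avoids.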

Approach: The idea is to store the top k elements with maximum frequency in a vector or an array of size k + 1. To keep track of the frequencies of elements, create a HashMap that stores element-frequency pairs. When a new element appears in the stream, update the frequency of that element in the HashMap and put that element in the last slot of the list (the (k+1)-th slot). Then compare adjacent elements of the list, moving from that element towards the front, and swap whenever the element with the higher frequency is stored after the one with the lower frequency.

Algorithm:

1. Create a HashMap hm and a temp array of length k + 1.
2. Traverse the input array from start to end.
3. Insert the current element at the (k+1)-th position of the temp array and update the frequency of that element in the HashMap.
4. Find the first position of that element in the temp array and traverse the temp array from that position back towards the start.
5. For every pair of adjacent elements, swap them if the element with the higher frequency is stored second; if the frequencies are the same, swap them if the second element is smaller. Stop at the first pair that needs no swap.
6. Print the top k elements after each step of the traversal of the original array.
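The steps above can be sketched as a function that returns one list per read instead of printing. This is a minimal sketch under two assumptions that are not in the original listings: the name `k_top_snapshots` is hypothetical, and empty slots are marked with `None` rather than 0 so that a stream containing 0 is handled.

```python
def k_top_snapshots(stream, k):
    """Return the top-k list after each element of `stream` is read."""
    top = [None] * (k + 1)   # None marks an empty slot (assumption)
    freq = {}                # element -> frequency
    snapshots = []
    for x in stream:
        freq[x] = freq.get(x, 0) + 1
        top[k] = x                   # step 3: place x in the last slot
        i = top.index(x) - 1         # step 4: start left of x's first position
        while i >= 0:
            left, right = top[i], top[i + 1]   # right is always x here
            # Step 5: x moves left past an empty slot, past a less frequent
            # element, or past an equally frequent but larger element.
            if left is None or (-freq[right], right) < (-freq[left], left):
                top[i], top[i + 1] = right, left
                i -= 1
            else:
                break
        # Step 6: report the first k slots, skipping empty ones.
        snapshots.append([v for v in top[:k] if v is not None])
    return snapshots


print(k_top_snapshots([5, 2, 1, 3, 2], 4))
```

The tuple comparison `(-freq, value)` encodes both rules of step 5 in one test: higher frequency wins, and on equal frequency the smaller value wins.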

Implementation:

## C++

```cpp
// C++ program to find top k elements in a stream
#include <algorithm>
#include <iostream>
#include <iterator>
#include <unordered_map>
#include <vector>
using namespace std;

// Function to print top k numbers.
// Note: 0 marks an empty slot in top, so the stream
// is assumed not to contain 0.
void kTop(int a[], int n, int k)
{
    // vector of size k+1 to store elements
    vector<int> top(k + 1);

    // map to keep track of frequency
    unordered_map<int, int> freq;

    // iterate till the end of stream
    for (int m = 0; m < n; m++) {
        // increase the frequency
        freq[a[m]]++;

        // store that element in top vector
        top[k] = a[m];

        // search in top vector for same element
        auto it = find(top.begin(), top.end() - 1, a[m]);

        // iterate from the position of element to zero
        for (int i = distance(top.begin(), it) - 1; i >= 0; --i) {
            // compare the frequency and swap if higher
            // frequency element is stored next to it
            if (freq[top[i]] < freq[top[i + 1]])
                swap(top[i], top[i + 1]);

            // if frequency is same compare the elements
            // and swap if next element is smaller
            else if ((freq[top[i]] == freq[top[i + 1]])
                     && (top[i] > top[i + 1]))
                swap(top[i], top[i + 1]);
            else
                break;
        }

        // print top k elements
        for (int i = 0; i < k && top[i] != 0; ++i)
            cout << top[i] << ' ';
    }
    cout << endl;
}

// Driver program to test above function
int main()
{
    int k = 4;
    int arr[] = { 5, 2, 1, 3, 2 };
    int n = sizeof(arr) / sizeof(arr[0]);
    kTop(arr, n, k);
    return 0;
}
```

## Java

```java
import java.io.*;
import java.util.*;

class GFG {

    // function to search in top vector for element
    static int find(int[] arr, int ele)
    {
        for (int i = 0; i < arr.length; i++)
            if (arr[i] == ele)
                return i;
        return -1;
    }

    // Function to print top k numbers
    static void kTop(int[] a, int n, int k)
    {
        // array of size k+1 to store elements
        int[] top = new int[k + 1];

        // map to keep track of frequency; keys 0..k start at 0
        // (key 0 is needed because 0 marks an empty slot in top)
        HashMap<Integer, Integer> freq = new HashMap<>();
        for (int i = 0; i < k + 1; i++)
            freq.put(i, 0);

        // iterate till the end of stream
        for (int m = 0; m < n; m++) {
            // increase the frequency
            if (freq.containsKey(a[m]))
                freq.put(a[m], freq.get(a[m]) + 1);
            else
                freq.put(a[m], 1);

            // store that element in top vector
            top[k] = a[m];

            // search in top vector for same element
            int i = find(top, a[m]);
            i -= 1;

            // iterate from the position of element to zero
            while (i >= 0) {
                // compare the frequency and swap if higher
                // frequency element is stored next to it
                if (freq.get(top[i]) < freq.get(top[i + 1])) {
                    int temp = top[i];
                    top[i] = top[i + 1];
                    top[i + 1] = temp;
                }

                // if frequency is same compare the elements
                // and swap if next element is smaller
                // (equals, not ==, because the values are Integer objects)
                else if (freq.get(top[i]).equals(freq.get(top[i + 1]))
                         && (top[i] > top[i + 1])) {
                    int temp = top[i];
                    top[i] = top[i + 1];
                    top[i + 1] = temp;
                }

                else
                    break;
                i -= 1;
            }

            // print top k elements
            for (int j = 0; j < k && top[j] != 0; ++j)
                System.out.print(top[j] + " ");
        }
        System.out.println();
    }

    // Driver program to test above function
    public static void main(String args[])
    {
        int k = 4;
        int[] arr = { 5, 2, 1, 3, 2 };
        int n = arr.length;
        kTop(arr, n, k);
    }
}

// This code is contributed by rachana soma
```

## Python

```python
# Python 3 program to find top k elements in a stream

# Function to print top k numbers
def kTop(a, n, k):

    # list of size k + 1 to store elements
    top = [0 for i in range(k + 1)]

    # dictionary to keep track of frequency; keys 0..k start at 0
    # (key 0 is needed because 0 marks an empty slot in top)
    freq = {i: 0 for i in range(k + 1)}

    # iterate till the end of stream
    for m in range(n):

        # increase the frequency
        if a[m] in freq:
            freq[a[m]] += 1
        else:
            freq[a[m]] = 1

        # store that element in top list
        top[k] = a[m]

        # search in top list for same element
        i = top.index(a[m])
        i -= 1

        while i >= 0:

            # compare the frequency and swap if higher
            # frequency element is stored next to it
            if freq[top[i]] < freq[top[i + 1]]:
                top[i], top[i + 1] = top[i + 1], top[i]

            # if frequency is same compare the elements
            # and swap if next element is smaller
            elif freq[top[i]] == freq[top[i + 1]] and top[i] > top[i + 1]:
                top[i], top[i + 1] = top[i + 1], top[i]
            else:
                break
            i -= 1

        # print top k elements
        i = 0
        while i < k and top[i] != 0:
            print(top[i], end=' ')
            i += 1
    print()

# Driver program to test above function
k = 4
arr = [5, 2, 1, 3, 2]
n = len(arr)
kTop(arr, n, k)

# This code is contributed by Sachin Bisht
```

## C#

```csharp
// C# program to find top k elements in a stream
using System;
using System.Collections.Generic;

class GFG {
    // function to search in top vector for element
    static int find(int[] arr, int ele)
    {
        for (int i = 0; i < arr.Length; i++)
            if (arr[i] == ele)
                return i;
        return -1;
    }

    // Function to print top k numbers
    static void kTop(int[] a, int n, int k)
    {
        // array of size k+1 to store elements
        int[] top = new int[k + 1];

        // dictionary to keep track of frequency
        Dictionary<int, int> freq = new Dictionary<int, int>();
        for (int i = 0; i < k + 1; i++)
            freq.Add(i, 0);

        // iterate till the end of stream
        for (int m = 0; m < n; m++) {
            // increase the frequency
            if (freq.ContainsKey(a[m]))
                freq[a[m]]++;
            else
                freq.Add(a[m], 1);

            // store that element in top vector
            top[k] = a[m];

            // search in top vector for same element
            int i = find(top, a[m]);
            i--;

            // iterate from the position of element to zero
            while (i >= 0) {
                // compare the frequency and swap if higher
                // frequency element is stored next to it
                if (freq[top[i]] < freq[top[i + 1]]) {
                    int temp = top[i];
                    top[i] = top[i + 1];
                    top[i + 1] = temp;
                }

                // if frequency is same compare the elements
                // and swap if next element is smaller
                else if (freq[top[i]] == freq[top[i + 1]] && top[i] > top[i + 1]) {
                    int temp = top[i];
                    top[i] = top[i + 1];
                    top[i + 1] = temp;
                }
                else
                    break;

                i--;
            }

            // print top k elements
            for (int j = 0; j < k && top[j] != 0; ++j)
                Console.Write(top[j] + " ");
        }
        Console.WriteLine();
    }

    // Driver Code
    public static void Main(String[] args)
    {
        int k = 4;
        int[] arr = { 5, 2, 1, 3, 2 };
        int n = arr.Length;
        kTop(arr, n, k);
    }
}

// This code is contributed by
// sanjeev2552
```

Output:

```
5 2 5 1 2 5 1 2 3 5 2 1 3 5
```

Complexity Analysis:

• Time Complexity: O(n * k).
For each of the n elements read, the temp array of size k + 1 is traversed at most once, so the time complexity is O(n * k).
• Space Complexity: O(n).
The HashMap stores one entry per distinct element, which is at most n entries, so O(n) space is required.
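The O(n * k) bound can be checked empirically by counting how many times the inner comparison loop runs. The following is a rough sketch, not part of the original listings: the helper name `count_bubble_steps` is hypothetical, and `None` is used for empty slots as an assumption.

```python
def count_bubble_steps(stream, k):
    """Count iterations of the inner loop of the k + 1 array approach."""
    top = [None] * (k + 1)   # None marks an empty slot (assumption)
    freq = {}
    steps = 0
    for x in stream:
        freq[x] = freq.get(x, 0) + 1
        top[k] = x
        i = top.index(x) - 1     # this search is itself O(k)
        while i >= 0:
            steps += 1           # one comparison of adjacent slots
            left = top[i]
            if left is None or (-freq[x], x) < (-freq[left], left):
                top[i], top[i + 1] = x, left
                i -= 1
            else:
                break
    return steps


stream = [5, 2, 1, 3, 2]
k = 4
steps = count_bubble_steps(stream, k)
print(steps, "inner-loop steps; bound n * k =", len(stream) * k)
```

Because the inner loop starts at index k - 1 at the latest and moves left by one position per iteration, it runs at most k times per element, which gives the n * k bound.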

This article is contributed by Niteesh Kumar.