Median and Mode using Counting Sort

Given an unsorted array of size n, find its median and mode using the counting sort technique. This is useful when the array elements are non-negative integers in a limited range.

Examples:

Input : array a[] = {1, 1, 1, 2, 7, 1}
Output : Mode = 1
         Median = 1

Input : array a[] = {9, 9, 9, 9, 9}
Output : Mode = 9
         Median = 9

Prerequisites: Count Sort, Median of Array, Mode (Most frequent element in array)

1. Auxiliary (count) array c[] for the input a[] = {1, 4, 1, 2, 7, 1, 2, 5, 3, 6}, before summing previous counts (values assumed to lie in 0..10):
Index: 0 1 2 3 4 5 6 7 8 9 10
count: 0 3 2 1 1 1 1 1 0 0 0

2. Mode = index with the maximum value in the count array.
Mode = 1 (for the above example)

3. The count array is then converted to cumulative counts, as is done in count sort:
Index: 0 1 2 3 4 5 6 7 8 9 10
count: 0 3 5 6 7 8 9 10 10 10 10

4. The sorted output array b[] is built as usual in count sort:
output array b[] = {1, 1, 1, 2, 2, 3, 4, 5, 6, 7}

5. If the size n of array b[] is odd, Median = b[n/2] (0-based index, integer division)
Else Median = (b[(n-1)/2] + b[n/2])/2.0

6. For above example size of b[] is even hence, Median = (b[4] + b[5])/2.
Median = (2 + 3)/2 = 2.5
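
The count and cumulative arrays for an input such as {1, 4, 1, 2, 7, 1, 2, 5, 3, 6} can be reproduced with a small sketch like the one below. The helper names buildCounts and cumulativeCounts are hypothetical (they do not appear in the article's program), and the sketch assumes every element lies in the range 0..maxValue.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: count occurrences of each value in [0, maxValue].
// Assumes every element of a lies in that range.
std::vector<int> buildCounts(const std::vector<int>& a, int maxValue)
{
    std::vector<int> count(maxValue + 1, 0);
    for (int x : a)
    {
        assert(x >= 0 && x <= maxValue);
        count[x]++;
    }
    return count;
}

// Hypothetical helper: turn counts into running totals, so that
// result[v] is the number of elements less than or equal to v.
std::vector<int> cumulativeCounts(std::vector<int> count)
{
    for (std::size_t i = 1; i < count.size(); i++)
        count[i] += count[i - 1];
    return count;
}
```

Calling buildCounts with maxValue = 10 on that input should give the array from step 1, and passing the result to cumulativeCounts should give the array from step 3.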

Basic approach to be followed:
Assuming the size of the input array is n:
Step1: Build the count array, but do not yet sum the previous counts into each index.
Step2: The index holding the maximum count is the mode of the given data.
Step3: If more than one index holds the maximum count, each of them is a valid mode, so any one can be taken.
Step4: Store that index in a separate variable called mode.
Step5: Continue with the normal processing of count sort.
Step6: In the resulting (sorted) array, if n is odd then median = the middle element; if n is even then median = the average of the two middle elements.
Step7: Store the result in a separate variable called median.
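
Step3 notes that several values can tie for the highest count. As a minimal sketch of that case, the hypothetical helper allModes below returns every tied value instead of picking one; it assumes a non-empty input of non-negative integers and is not part of the article's program.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: return every value whose count equals the maximum.
// Assumes a is non-empty and holds only non-negative integers.
std::vector<int> allModes(const std::vector<int>& a)
{
    assert(!a.empty());
    assert(*std::min_element(a.begin(), a.end()) >= 0);

    // Count array of size (max + 1), as in counting sort.
    int maxValue = *std::max_element(a.begin(), a.end());
    std::vector<int> count(maxValue + 1, 0);
    for (int x : a)
        count[x]++;

    // Collect all indexes that hold the highest count, in increasing order.
    int best = *std::max_element(count.begin(), count.end());
    std::vector<int> modes;
    for (int v = 0; v <= maxValue; v++)
        if (count[v] == best)
            modes.push_back(v);
    return modes;
}
```

For an input such as {1, 1, 2, 2, 3}, both 1 and 2 occur twice, so both are returned.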

Following is the implementation of the approach discussed above:

    C++

    // C++ program for mode and median
    // using the counting sort technique
    #include <algorithm>
    #include <iostream>
    #include <vector>
    using namespace std;
    
    // Function that sorts input array a[] and
    // calculates mode and median using counting
    // sort. Assumes n > 0 and that every element
    // is a non-negative integer.
    void printModeMedian(int a[], int n)
    {
        // The output array b[] will
        // hold the sorted array
        vector<int> b(n);
    
        // Maximum of the input array, which
        // determines the size of the count array
        int max = *max_element(a, a+n);
    
        // Auxiliary (count) array to
        // store counts, initialized
        // to 0. Its size is (max + 1).
        int t = max + 1;
        vector<int> count(t, 0);
    
        // Store count of each element
        // of input array
        for (int i = 0; i < n; i++)
            count[a[i]]++;    
        
        // mode is the index with maximum count
        int mode = 0;
        int k = count[0];
        for (int i = 1; i < t; i++)
        {
            if (count[i] > k)
            {
                k = count[i];
                mode = i;
            }
        }    
    
        // Update count[] array with sum
        for (int i = 1; i < t; i++)
            count[i] = count[i] + count[i-1];
    
        // Sorted output array b[]
        // to calculate median
        for (int i = 0; i < n; i++)
        {
            b[count[a[i]]-1] = a[i];
            count[a[i]]--;
        }
        
        // Median according to odd and even 
        // array size respectively.
        float median;
        if (n % 2 != 0)
            median = b[n/2];
        else
            median = (b[(n-1)/2] + 
                      b[(n/2)])/2.0;
        
        // Output the result
        cout << "median = " << median << endl;
        cout << "mode = " << mode << endl;
    }
    
    // Driver program
    int main()
    {
        int a[] = { 1, 4, 1, 2, 7, 1, 2,  5, 3, 6 };
        int n = sizeof(a)/sizeof(a[0]);
        printModeMedian(a, n);
        return 0;
    }
    

    Output:

    median = 2.5
    mode = 1
    

Time Complexity = O(N + P), where N is the number of elements in the input array and P is the size of the count array (max + 1).
Space Complexity = O(N + P): O(P) for the count array plus O(N) for the sorted output array b[].
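
As a variation on the program above (not the article's method), the median can also be read straight from the count array without building the sorted output array, so only the count array is needed as extra storage. The sketch below is a minimal illustration under the same assumptions (non-empty input, non-negative integers); valueAtRank and medianFromCounts are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: return the value at 0-based position rank in
// sorted order, found by accumulating counts until they exceed rank.
int valueAtRank(const std::vector<int>& count, int rank)
{
    int seen = 0;
    for (int v = 0; v < static_cast<int>(count.size()); v++)
    {
        seen += count[v];
        if (seen > rank)
            return v;
    }
    return -1; // rank is outside the array
}

// Hypothetical helper: median computed from counts only.
// Assumes a is non-empty and holds only non-negative integers.
double medianFromCounts(const std::vector<int>& a)
{
    assert(!a.empty());
    assert(*std::min_element(a.begin(), a.end()) >= 0);

    int maxValue = *std::max_element(a.begin(), a.end());
    std::vector<int> count(maxValue + 1, 0);
    for (int x : a)
        count[x]++;

    // For odd n both ranks coincide, so the average is the middle element.
    int n = static_cast<int>(a.size());
    int lower = valueAtRank(count, (n - 1) / 2);
    int upper = valueAtRank(count, n / 2);
    return (lower + upper) / 2.0;
}
```

The trade-off is that the sorted array is never produced, so this form only helps when the median (or another order statistic) is all that is needed.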

