Median of two sorted arrays with different sizes in O(log(min(n, m)))

Given two sorted arrays a[] and b[], the task is to find the median of their combined elements in O(log(min(n, m))) time, where n is the number of elements in the first array and m is the number of elements in the second array.

Prerequisite : Median of two different sized sorted arrays.

Examples :

Input : ar1[] = {-5, 3, 6, 12, 15}
        ar2[] = {-12, -10, -6, -3, 4, 10}
        The merged array is :
        ar3[] = {-12, -10, -6, -5 , -3,
                 3, 4, 6, 10, 12, 15}
Output : The median is 3.

Input : ar1[] = {2, 3, 5, 8}
        ar2[] = {10, 12, 14, 16, 18, 20}
        The merged array is :
        ar3[] = {2, 3, 5, 8, 10, 12, 14, 16, 18, 20}
        The number of elements is even,
        so there are two middle elements;
        take the average of the two :
        (10 + 12) / 2 = 11.
Output : The median is 11.

Note : When the total number of elements is even and we want to return a median that exists in the merged array, we can return the element at (0-based) position (n+m)/2 or (n+m)/2 - 1. In the second example above, that median would be 12 or 10, respectively.
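The two examples can be checked against a simple linear-time baseline that actually builds the merged array. The sketch below is only a reference for comparison, not the logarithmic method this article describes; the function name medianByMerging is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Reference baseline (hypothetical helper): merge both sorted arrays in
// O(n + m) time and read the middle element(s) of the merged array.
// Assumes at least one of the inputs is non-empty.
double medianByMerging(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> merged;
    merged.reserve(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(),
               std::back_inserter(merged));
    assert(!merged.empty());

    const std::size_t total = merged.size();

    // Odd total: exactly one middle element.
    if (total % 2 == 1)
        return merged[total / 2];

    // Even total: average of the elements at positions
    // total / 2 - 1 and total / 2 (0-based).
    return (static_cast<double>(merged[total / 2 - 1]) + merged[total / 2]) / 2.0;
}
```

Calling medianByMerging({-5, 3, 6, 12, 15}, {-12, -10, -6, -3, 4, 10}) gives 3, and medianByMerging({2, 3, 5, 8}, {10, 12, 14, 16, 18, 20}) gives 11, matching the examples above.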



Approach : Partition the elements of the two arrays into two halves with the same number of elements (the first half gets one extra element when the total is odd). The first half contains some leading elements of both arrays, and the second half contains the remaining (trailing) elements of both arrays. Because the arrays can have different sizes, this does not mean taking half of each array. The example below clarifies this. The goal is to reach a partition in which every element in the first half is less than or equal to every element in the second half.

How to reach this condition ?
Take the even-total example and suppose a partition has been found. Because A[] and B[] are sorted, a1 is less than or equal to a2, and b2 is less than or equal to b3. It remains to check whether a1 is less than or equal to b3 and whether b2 is less than or equal to a2. If both hold, every element in the first half is less than or equal to every element in the second half, because a1 is greater than or equal to every element before it in A[] (a0), and b2 is greater than or equal to every element before it in B[] (b1 and b0). When the total number of elements is even, the median is the average of the maximum of a1, b2 and the minimum of a2, b3; when the total is odd, the median is the maximum of the last elements of the first half (a2 and b2 in the odd example). If the check fails, there are two possibilities (referring to the even-total example) :
b2 > a2 or a1 > b3
If b2 > a2, the first half holds too few elements of A[], so continue the search on the right side of A[]. If a1 > b3, it holds too many, so continue on the left side. Repeat until the desired condition is met.
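The decision described above can be isolated into one small function. This is a minimal sketch, assuming i and j are the numbers of elements taken from each array into the first half; the name classifyPartition and its +1 / -1 / 0 return convention are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: classifies a candidate partition.
//   i = number of elements of a[] placed in the first half
//   j = number of elements of b[] placed in the first half
// Returns +1 if the search must move right (i is too small),
//         -1 if the search must move left (i is too large),
//          0 if every element of the first half is <= every
//            element of the second half.
int classifyPartition(const std::vector<int>& a, const std::vector<int>& b,
                      int i, int j)
{
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    assert(0 <= i && i <= n && 0 <= j && j <= m);

    // The last element taken from b[] is larger than the first
    // element of a[] left in the second half.
    if (i < n && j > 0 && b[j - 1] > a[i])
        return +1;

    // The last element taken from a[] is larger than the first
    // element of b[] left in the second half.
    if (i > 0 && j < m && a[i - 1] > b[j])
        return -1;

    return 0;
}
```

For a = {2, 3, 5, 8} and b = {10, 12, 14, 16, 18, 20}, the candidate i = 2, j = 3 yields +1 (14 > 5), while i = 4, j = 1 yields 0, which is the desired partition.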

Why the above condition leads to the median ?
In a single sorted array of n elements (n odd), the median is the ((n + 1) / 2)-th smallest element; here, the median is the ((n + m + 1) / 2)-th smallest element among the two arrays. If all the elements in the first half are less than or equal to all the elements in the second half, then when the total is odd, just take the maximum of the last two elements in the first half (a2 and b2 in our example). This is the ((n + m + 1) / 2)-th smallest element among the two arrays, which is the median ((7 + 4 + 1) / 2 = 6). When the total is even, take the average of that maximum (of a1 and b2 in our example) and its successor among the arrays, which is the minimum of the first two elements in the second half (a2 and b3 in our example).
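Once a valid partition is known, the median follows from the four boundary elements alone. The sketch below illustrates that step; it is a variation that uses INT_MIN / INT_MAX sentinels for an empty side instead of the explicit i == 0 and j == 0 checks used in the implementations in this article, and the name medianFromPartition is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Assumes (i, j) is already a valid partition:
// i + j == (n + m + 1) / 2 and every element of the first half
// is <= every element of the second half.
double medianFromPartition(const std::vector<int>& a, const std::vector<int>& b,
                           int i, int j)
{
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    assert(n + m > 0 && i + j == (n + m + 1) / 2);

    // Sentinels stand in for a side that contributes no element.
    const int aLeft  = (i > 0) ? a[i - 1] : INT_MIN;
    const int bLeft  = (j > 0) ? b[j - 1] : INT_MIN;
    const int aRight = (i < n) ? a[i] : INT_MAX;
    const int bRight = (j < m) ? b[j] : INT_MAX;

    // Largest element of the first half: the
    // ((n + m + 1) / 2)-th smallest element overall.
    const int maxLeft = std::max(aLeft, bLeft);
    if ((n + m) % 2 == 1)
        return maxLeft;

    // Smallest element of the second half: its successor
    // in the merged order.
    const int minRight = std::min(aRight, bRight);
    return (static_cast<double>(maxLeft) + minRight) / 2.0;
}
```

For a = {-5, 3, 6, 12, 15} and b = {-12, -10, -6, -3, 4, 10}, the valid partition is i = 2, j = 4, and the function returns max(3, -3) = 3. For a = {2, 3, 5, 8} and b = {10, 12, 14, 16, 18, 20} with i = 4, j = 1, it returns (10 + 12) / 2 = 11.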

The process of the partition :
To make the two halves, choose the partition so that the number of elements taken from A[] plus the number taken from B[] equals (n + m + 1) / 2, using integer division. The + 1 gives the first half the extra element when the total is odd and has no effect when the total is even.
First, define two variables, min_index and max_index; initialize min_index to 0 and max_index to the length of the smaller array. In the examples below, A[] is the smaller array.
To partition A[], compute (min_index + max_index) / 2 and store it in a variable i. To partition B[], compute (n + m + 1) / 2 - i and store it in a variable j.
The variable i is the number of elements of A[] placed in the first half, and j is the number of elements of B[] placed in the first half; the remaining elements go into the second half.
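The index arithmetic described above can be traced step by step. The following sketch records every (i, j) pair the binary search visits; the function name tracePartitions is hypothetical, and the code assumes the first argument is the smaller (or equal-sized) array.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical tracing helper: returns the sequence of (i, j)
// pairs tried by the binary search, ending with the valid one.
// Assumes a and b are sorted and a.size() <= b.size().
std::vector<std::pair<int, int>> tracePartitions(const std::vector<int>& a,
                                                 const std::vector<int>& b)
{
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    assert(n <= m);

    std::vector<std::pair<int, int>> steps;
    int min_index = 0, max_index = n;

    while (min_index <= max_index)
    {
        const int i = (min_index + max_index) / 2; // taken from a[]
        const int j = (n + m + 1) / 2 - i;         // taken from b[]
        steps.emplace_back(i, j);

        if (i < n && j > 0 && b[j - 1] > a[i])
            min_index = i + 1;   // too few elements of a[]: go right
        else if (i > 0 && j < m && b[j] < a[i - 1])
            max_index = i - 1;   // too many elements of a[]: go left
        else
            break;               // valid partition found
    }
    return steps;
}
```

For a = {2, 3, 5, 8} and b = {10, 12, 14, 16, 18, 20}, the search visits (2, 3), then (3, 2), then stops at (4, 1). For a = {900} and b = {10, 13, 14}, the first candidate (0, 2) is already valid.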
Take a look at the below examples :
Example 1 :


Example 2 (This example refers to the condition that returns a median that exists in the merged array) :


Below is the implementation of above approach :

C++

// C++ code for the median that returns a
// double value (the average of the two middle
// elements) when the total number of elements
// in both arrays combined is even
#include <iostream>
using std::cout;

int maximum(int a, int b);
int minimum(int a, int b);

// Function to find median of two sorted arrays
double findMedianSortedArrays(int *a, int n, 
                              int *b, int m)
{
    
    int min_index = 0, max_index = n, i = 0, j = 0, median = 0;
    
    while (min_index <= max_index)
    {
        i = (min_index + max_index) / 2;
        j = ((n + m + 1) / 2) - i;
    
        // If i == n, no elements of a[] are in the
        // second half, and if j == 0, no elements of
        // b[] are in the first half. Both must be
        // checked before comparing b[j - 1] with a[i],
        // otherwise an index would be out of range.
        // Search on the right.
        if (i < n && j > 0 && b[j - 1] > a[i])        
            min_index = i + 1;
                
        // If i == 0, no elements of a[] are in the
        // first half, and if j == m, no elements of
        // b[] are in the second half. Both must be
        // checked before comparing b[j] with a[i - 1],
        // otherwise an index would be out of range.
        // Search on the left.
        else if (i > 0 && j < m && b[j] < a[i - 1])        
            max_index = i - 1;

        // we have found the desired halves.
        else
        {
            // No elements of a[] are in the first
            // half, so the last element of the first
            // half is the last element taken from b[].
            if (i == 0)            
                median = b[j - 1];            
            
            // No elements of b[] are in the first
            // half, so the last element of the first
            // half is the last element taken from a[].
            else if (j == 0)            
                median = a[i - 1];            
            else            
                median = maximum(a[i - 1], b[j - 1]);            
            break;
        }
    }
    
    // calculating the median.
    // If number of elements is odd there is 
    // one middle element.
    if ((n + m) % 2 == 1)
        return (double)median;
        
    // No elements of a[] are in the second half.
    // The sum is computed as double to avoid int overflow.
    if (i == n)
        return ((double)median + b[j]) / 2.0;
        
    // No elements of b[] are in the second half.
    if (j == m)
        return ((double)median + a[i]) / 2.0;
    
    return ((double)median + minimum(a[i], b[j])) / 2.0;
}

// Function to find max
int maximum(int a, int b) 
{
    return a > b ? a : b;
}

// Function to find minimum
int minimum(int a, int b) 
{
    return a < b ? a : b; 
}

// Driver code
int main()
{
    int a[] = {900};
    int b[] = { 10, 13, 14};
    int n = sizeof(a) / sizeof(int);
    int m = sizeof(b) / sizeof(int);
    
    // we need to define the smaller array as the 
    // first parameter to make sure that the 
    // time complexity will be O(log(min(n,m)))
    if (n < m)
        cout << "The median is : "
             << findMedianSortedArrays(a, n, b, m);
    else
        cout << "The median is : "
             << findMedianSortedArrays(b, m, a, n);

    return 0;
}

Java

// Java code for the median that
// returns a double value when the
// total number of elements in
// both arrays combined is even
import java.io.*;

class GFG
{
    static int []a = new int[]{900};
    static int []b = new int[]{10, 13, 14};

    // Function to find max
    static int maximum(int a, int b) 
    {
        return a > b ? a : b;
    }
    
    // Function to find minimum
    static int minimum(int a, int b) 
    {
        return a < b ? a : b; 
    }
    
    // Function to find median 
    // of two sorted arrays
    static double findMedianSortedArrays(int n, 
                                         int m)
    {
        
        int min_index = 0, 
            max_index = n, i = 0,
            j = 0, median = 0;
        
        while (min_index <= max_index)
        {
            i = (min_index + max_index) / 2;
            j = ((n + m + 1) / 2) - i;
        
            // if i = n, it means that Elements 
            // from a[] in the second half is an 
            // empty set. and if j = 0, it means 
            // that Elements from b[] in the first
            // half is an empty set. so it is 
            // necessary to check that, because we
            // compare elements from these two 
            // groups. Searching on right
            if (i < n && j > 0 && b[j - 1] > a[i])     
                min_index = i + 1;
                    
            // if i = 0, it means that Elements
            // from a[] in the first half is an 
            // empty set and if j = m, it means 
            // that Elements from b[] in the second
            // half is an empty set. so it is 
            // necessary to check that, because 
            // we compare elements from these two
            // groups. searching on left
            else if (i > 0 && j < m && b[j] < a[i - 1])     
                max_index = i - 1;
    
            // we have found the desired halves.
            else
            {
                // No elements of a[] are in the
                // first half, so the last element
                // of the first half is the last
                // element taken from b[].
                if (i == 0)         
                    median = b[j - 1];         
                
                // No elements of b[] are in the
                // first half, so the last element
                // of the first half is the last
                // element taken from a[].
                else if (j == 0)         
                    median = a[i - 1];         
                else    
                    median = maximum(a[i - 1], 
                                     b[j - 1]);         
                break;
            }
        }
        
        // calculating the median.
        // If number of elements is odd 
        // there is one middle element.
        if ((n + m) % 2 == 1)
            return (double)median;
            
        // Elements from a[] in the 
        // second half is an empty set. 
        if (i == n)
            return (median + b[j]) / 2.0;
            
        // Elements from b[] in the
        // second half is an empty set.
        if (j == m)
            return (median + a[i]) / 2.0;
        
        return (median + minimum(a[i], 
                                 b[j])) / 2.0;
    }
    
    // Driver code
    public static void main(String args[])
    {
        int n = a.length;
        int m = b.length;
        
        // we need to define the 
        // smaller array as the 
        // first parameter to 
        // make sure that the
        // time complexity will
        // be O(log(min(n,m)))
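        if (n < m)
            System.out.print("The median is : " + 
                   findMedianSortedArrays(n, m));
        else
        {
            // findMedianSortedArrays reads the static
            // arrays a and b, so swap them along with
            // the sizes.
            int[] temp = a;
            a = b;
            b = temp;
            System.out.print("The median is : " + 
                   findMedianSortedArrays(m, n));
        }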
    }
} 

// This code is contributed by 
// Manish Shaw(manishshaw1)

C#

// C# code for the median that returns a
// double value when the total number of
// elements in both arrays combined is even
using System;

class GFG {
    
    // Function to find max
    static int maximum(int a, int b) 
    {
        return a > b ? a : b;
    }
    
    // Function to find minimum
    static int minimum(int a, int b) 
    {
        return a < b ? a : b; 
    }
    
    // Function to find median of two sorted
    // arrays
    static double findMedianSortedArrays(ref int []a,
                           int n, ref int []b, int m)
    {
        
        int min_index = 0, max_index = n, i = 0,
        j = 0, median = 0;
        
        while (min_index <= max_index)
        {
            i = (min_index + max_index) / 2;
            j = ((n + m + 1) / 2) - i;
        
            // if i = n, it means that Elements 
            // from a[] in the second half is an 
            // empty set. and if j = 0, it means 
            // that Elements from b[] in the first
            // half is an empty set. so it is 
            // necessary to check that, because we
            // compare elements from these two 
            // groups. Searching on right
            if (i < n && j > 0 && b[j - 1] > a[i])     
                min_index = i + 1;
                    
            // if i = 0, it means that Elements
            // from a[] in the first half is an 
            // empty set and if j = m, it means 
            // that Elements from b[] in the second
            // half is an empty set. so it is 
            // necessary to check that, because 
            // we compare elements from these two
            // groups. searching on left
            else if (i > 0 && j < m && b[j] < a[i - 1])     
                max_index = i - 1;
    
            // we have found the desired halves.
            else
            {
                // No elements of a[] are in the
                // first half, so the last element
                // of the first half is the last
                // element taken from b[].
                if (i == 0)         
                    median = b[j - 1];         
                
                // No elements of b[] are in the
                // first half, so the last element
                // of the first half is the last
                // element taken from a[].
                else if (j == 0)         
                    median = a[i - 1];         
                else        
                    median = maximum(a[i - 1], b[j - 1]);         
                break;
            }
        }
        
        // calculating the median.
        // If number of elements is odd 
        // there is one middle element.
        if ((n + m) % 2 == 1)
            return (double)median;
            
        // Elements from a[] in the second 
        // half is an empty set. 
        if (i == n)
            return (median+b[j]) / 2.0;
            
        // Elements from b[] in the second 
        // half is an empty set.
        if (j == m)
            return (median + a[i]) / 2.0;
        
        return (median + minimum(a[i], b[j])) / 2.0;
    }
    
    // Driver code
    static void Main()
    {
        int []a = new int[]{900};
        int []b = new int[]{ 10, 13, 14};
        int n = a.Length;
        int m = b.Length;
        
        // we need to define the smaller 
        // array as the first parameter to 
        // make sure that the time 
        // complexity will be O(log(min(n,m)))
        if (n < m)
            Console.Write("The median is : "
            + findMedianSortedArrays(ref a, n, 
                                   ref b, m));
        else
            Console.Write("The median is : "
            + findMedianSortedArrays(ref b, m, 
                                   ref a, n));
    }
}

// This code is contributed by Manish Shaw
// (manishshaw1)

PHP

<?php
// PHP code for the median that
// returns a double value when
// the total number of elements
// in both arrays combined is even
$median = 0;
$i = 0; $j = 0;

// Function to find max
function maximum($a, $b) 
{
    return $a > $b ? $a : $b;
}

// Function to find minimum
function minimum($a, $b) 
{
    return $a < $b ? $a : $b; 
}

// Function to find median
// of two sorted arrays
function findMedianSortedArrays(&$a, $n, 
                                &$b, $m)
{
    global $median, $i, $j;
    $min_index = 0; 
    $max_index = $n; 
    
    while ($min_index <= $max_index)
    {
        $i = intval(($min_index + 
                     $max_index) / 2);
        $j = intval((($n + $m + 1) / 
                           2) - $i);
    
        // if i = n, it means that 
        // Elements from a[] in the
        // second half is an empty 
        // set. and if j = 0, it 
        // means that Elements from 
        // b[] in the first half is 
        // an empty set. so it is 
        // necessary to check that, 
        // because we compare elements 
        // from these two groups. 
        // Searching on right
        if ($i < $n && $j > 0 && 
            $b[$j - 1] > $a[$i])     
            $min_index = $i + 1;
                
        // if i = 0, it means that 
        // Elements from a[] in the
        // first half is an empty 
        // set and if j = m, it means
        // that Elements from b[] in 
        // the second half is an empty 
        // set. so it is necessary to
        // check that, because we compare 
        // elements from these two groups.
        // searching on left
        else if ($i > 0 && $j < $m && 
                 $b[$j] < $a[$i - 1])     
            $max_index = $i - 1;
        
        // we have found the
        // desired halves.
        else
        {
            // No elements of a[] are in the
            // first half, so the last element
            // of the first half is the last
            // element taken from b[].
            if ($i == 0) 
                $median = $b[$j - 1];
                
            // No elements of b[] are in the
            // first half, so the last element
            // of the first half is the last
            // element taken from a[].
            else if ($j == 0)         
                $median = $a[$i - 1];         
            else        
                $median = maximum($a[$i - 1], 
                                  $b[$j - 1]); 
            break;
        }
    }
    
    // calculating the median.
    // If number of elements 
    // is odd there is 
    // one middle element.

    if (($n + $m) % 2 == 1)
        return $median;

    // Elements from a[] in the 
    // second half is an empty set. 
    if ($i == $n)
        return (($median + 
                 $b[$j]) / 2.0);

    // Elements from b[] in the 
    // second half is an empty set.
    if ($j == $m)
        return (($median + 
                 $a[$i]) / 2.0);
    
    return (($median + 
             minimum($a[$i], 
                     $b[$j])) / 2.0);
}

// Driver code
$a = array(900);
$b = array(10, 13, 14);
$n = count($a);
$m = count($b);

// we need to define the 
// smaller array as the 
// first parameter to make 
// sure that the time complexity
// will be O(log(min(n,m)))
if ($n < $m)
    echo ("The median is : " . 
           findMedianSortedArrays($a, $n, 
                                  $b, $m));
else
    echo ("The median is : " . 
           findMedianSortedArrays($b, $m, 
                                  $a, $n));
                                  
// This code is contributed 
// by Manish Shaw(manishshaw1)
?>


Output:

The median is : 13.5

Another Approach : The same program, but returning a median that exists in the merged array (the element at 0-based position (n + m) / 2 - 1 when the total is even) :

C++

// C++ code for finding a median of the given two
// sorted arrays that exists in the merged array
// (position (n + m) / 2 - 1 when the total is even)
#include <iostream>
using std::cout;

int maximum(int a, int b);

// Function to find median of given two sorted arrays
int findMedianSortedArrays(int *a, int n, 
                           int *b, int m)
{
    
    int min_index = 0, max_index = n, i, j;
    
    while (min_index <= max_index)
    {
        i = (min_index + max_index) / 2;
        j = ((n + m + 1) / 2) - i;
    
        // if i = n, it means that Elements from a[] in
        // the second half is an empty set. If j = 0, it
        // means that Elements from b[] in the first half
        // is an empty set. so it is necessary to check that,
        // because we compare elements from these two groups.
        // searching on right
        if (i < n && j > 0 && b[j - 1] > a[i])        
            min_index = i + 1;        
        
        // if i = 0, it means that Elements from a[] in the
        // first half is an empty set and if j = m, it means
        // that Elements from b[] in the second half is an
        // empty set. so it is necessary to check that, 
        // because we compare elements from these two groups.
        // searching on left
        else if (i > 0 && j < m && b[j] < a[i - 1])        
            max_index = i - 1;        
        
        // we have found the desired halves.
        else
        {
            // No elements of a[] are in the first half,
            // so return the last element taken from b[].
            if (i == 0)            
                return b[j - 1];            
            
            // No elements of b[] are in the first half,
            // so return the last element taken from a[].
            if (j == 0)            
                return a[i - 1];            
            else            
                return maximum(a[i - 1], b[j - 1]);           
        }
    }
    
    // Not reached when both arrays are sorted and n <= m.
    cout << "ERROR!!! " << "returning\n";    
    return 0;
}

// Function to find maximum
int maximum(int a, int b) 
{
    return a > b ? a : b; 
}

// Driver code
int main()
{
    int a[] = {900};
    int b[] = { 10,13,14};
    int n = sizeof(a) / sizeof(int);
    int m = sizeof(b) / sizeof(int);
    
    // we need to define the smaller array as the first 
    // parameter to make sure that the time complexity 
    // will be O(log(min(n,m)))
    if (n < m)
        cout << "The median is: "
             << findMedianSortedArrays(a, n, b, m);
    else
        cout << "The median is: " 
             << findMedianSortedArrays(b, m, a, n);
    return 0;
}
Output:

The median is: 13

Time Complexity : O(log(min(n, m))), where n and m are the sizes of the given sorted arrays, because the binary search runs only over the smaller array.
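The drivers above pick the smaller array by hand before calling the function. As a minimal sketch, the choice can be made inside the function so that the bound holds for any argument order. The name medianOfTwoSorted and the use of std::vector are assumptions made for illustration; the search itself follows the same steps as the first program.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper: accepts the arrays in any order.
// Assumes both are sorted and at least one is non-empty.
double medianOfTwoSorted(const std::vector<int>& x, const std::vector<int>& y)
{
    // Binary-search the shorter array so the loop runs
    // O(log(min(n, m))) times.
    const std::vector<int>& a = (x.size() <= y.size()) ? x : y;
    const std::vector<int>& b = (x.size() <= y.size()) ? y : x;
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    assert(n + m > 0);

    int lo = 0, hi = n;
    while (lo <= hi)
    {
        const int i = lo + (hi - lo) / 2;   // elements taken from a[]
        const int j = (n + m + 1) / 2 - i;  // elements taken from b[]

        if (i < n && j > 0 && b[j - 1] > a[i])
            lo = i + 1;
        else if (i > 0 && j < m && b[j] < a[i - 1])
            hi = i - 1;
        else
        {
            // Largest element of the first half.
            int maxLeft;
            if (i == 0)      maxLeft = b[j - 1];
            else if (j == 0) maxLeft = a[i - 1];
            else             maxLeft = std::max(a[i - 1], b[j - 1]);
            if ((n + m) % 2 == 1)
                return maxLeft;

            // Smallest element of the second half.
            int minRight;
            if (i == n)      minRight = b[j];
            else if (j == m) minRight = a[i];
            else             minRight = std::min(a[i], b[j]);
            return (static_cast<double>(maxLeft) + minRight) / 2.0;
        }
    }
    // Not reached for sorted input.
    throw std::logic_error("inputs are not sorted");
}
```

With this wrapper, medianOfTwoSorted({10, 13, 14}, {900}) and medianOfTwoSorted({900}, {10, 13, 14}) both return 13.5.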


