Median of two sorted arrays

Question: There are two sorted arrays A and B, each of size n. Write an algorithm to find the median of the array obtained by merging the two arrays (i.e., an array of length 2n). The time complexity should be O(log n).

Median: In probability theory and statistics, a median is described as the number separating the higher half of a sample, a population, or a probability distribution, from the lower half.
The median of a finite list of numbers can be found by arranging all the numbers from lowest value to highest value and picking the middle one.

For getting the median of input array { 12, 11, 15, 10, 20 }, first sort the array. We get { 10, 11, 12, 15, 20 } after sorting. Median is the middle element of the sorted array which is 12.

There are different conventions for the median of an array with an even number of elements: one can take the mean of the two middle values, the first middle value, or the second middle value.
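The three conventions can be sketched as follows. This is only an illustration: the function names lower_median, upper_median and mean_median are assumptions and do not appear in the implementations below, and the input is assumed to be sorted already.

```c
#include <assert.h>
#include <stddef.h>

/* First (lower) middle element of a sorted array of n > 0 elements. */
int lower_median(const int *sorted, size_t n)
{
    assert(n > 0);
    return sorted[(n - 1) / 2];
}

/* Second (upper) middle element; same as lower_median when n is odd. */
int upper_median(const int *sorted, size_t n)
{
    assert(n > 0);
    return sorted[n / 2];
}

/* Mean of the two middle elements, returned as a double so that
   a value such as 11.5 is not truncated. */
double mean_median(const int *sorted, size_t n)
{
    return ((double)lower_median(sorted, n) + upper_median(sorted, n)) / 2.0;
}
```

For the sorted array { 10, 11, 12, 15, 20 } all three functions give 12. For { 10, 11, 12, 15 } they give 11, 12 and 11.5 respectively.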

Let us see different methods to get the median of two sorted arrays of size n each. Since the merged set has an even number of elements (2n), all the solutions below take the average of the two middle numbers. The implementations return an int, so this average is truncated by integer division.

Method 1 (Simply count while Merging)
Use the merge procedure of merge sort. Keep a count of the elements taken so far while comparing elements of the two arrays. When the count reaches n (for 2n elements), we have reached the median: take the average of the elements at indexes n-1 and n of the merged array. See the implementation below.

Implementation:

#include <stdio.h>

/* This function returns median of ar1[] and ar2[].
   Assumptions in this function:
   Both ar1[] and ar2[] are sorted arrays
   Both have n elements */
int getMedian(int ar1[], int ar2[], int n)
{
    int i = 0;  /* Current index of i/p array ar1[] */
    int j = 0; /* Current index of i/p array ar2[] */
    int count;
    int m1 = -1, m2 = -1;

    /* Since there are 2n elements, median will be average
     of elements at index n-1 and n in the array obtained after
     merging ar1 and ar2 */
    for (count = 0; count <= n; count++)
    {
        /*Below is to handle case where all elements of ar1[] are
          smaller than smallest(or first) element of ar2[]*/
        if (i == n)
        {
            m1 = m2;
            m2 = ar2[0];
            break;
        }

        /*Below is to handle case where all elements of ar2[] are
          smaller than smallest(or first) element of ar1[]*/
        else if (j == n)
        {
            m1 = m2;
            m2 = ar1[0];
            break;
        }

        if (ar1[i] < ar2[j])
        {
            m1 = m2;  /* Store the prev median */
            m2 = ar1[i];
            i++;
        }
        else
        {
            m1 = m2;  /* Store the prev median */
            m2 = ar2[j];
            j++;
        }
    }

    return (m1 + m2)/2;
}

/* Driver program to test above function */
int main()
{
    int ar1[] = {1, 12, 15, 26, 38};
    int ar2[] = {2, 13, 17, 30, 45};

    int n1 = sizeof(ar1)/sizeof(ar1[0]);
    int n2 = sizeof(ar2)/sizeof(ar2[0]);
    if (n1 == n2)
        printf("Median is %d", getMedian(ar1, ar2, n1));
    else
        printf("Doesn't work for arrays of unequal size");
    getchar();
    return 0;
}

Time Complexity: O(n)




Method 2 (By comparing the medians of two arrays)

This method works by first getting medians of the two sorted arrays and then comparing them.

Let ar1 and ar2 be the input arrays.

Algorithm:

1) Calculate the medians m1 and m2 of the input arrays ar1[] 
   and ar2[] respectively.
2) If m1 and m2 both are equal then we are done.
     return m1 (or m2)
3) If m1 is greater than m2, then median is present in one 
   of the below two subarrays.
    a)  From first element of ar1 to m1 (ar1[0...|_n/2_|])
    b)  From m2 to last element of ar2  (ar2[|_n/2_|...n-1])
4) If m2 is greater than m1, then median is present in one    
   of the below two subarrays.
   a)  From m1 to last element of ar1  (ar1[|_n/2_|...n-1])
   b)  From first element of ar2 to m2 (ar2[0...|_n/2_|])
5) Repeat the above process until size of both the subarrays 
   becomes 2.
6) If size of the two arrays is 2 then use below formula to get 
  the median.
    Median = (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1]))/2

Example:

   ar1[] = {1, 12, 15, 26, 38}
   ar2[] = {2, 13, 17, 30, 45}

For above two arrays m1 = 15 and m2 = 17

For the above ar1[] and ar2[], m1 is smaller than m2. So median is present in one of the following two subarrays.

   [15, 26, 38] and [2, 13, 17]

Let us repeat the process for above two subarrays:

    m1 = 26 m2 = 13.

m1 is greater than m2. So the subarrays become

  [15, 26] and [13, 17]
Now size is 2, so median = (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1]))/2
                       = (max(15, 13) + min(26, 17))/2 
                       = (15 + 17)/2
                       = 16

Implementation:

#include<stdio.h>

int max(int, int); /* to get maximum of two integers */
int min(int, int); /* to get minimum of two integers */
int median(int [], int); /* to get median of a sorted array */

/* This function returns median of ar1[] and ar2[].
   Assumptions in this function:
   Both ar1[] and ar2[] are sorted arrays
   Both have n elements */
int getMedian(int ar1[], int ar2[], int n)
{
    int m1; /* For median of ar1 */
    int m2; /* For median of ar2 */

    /* return -1  for invalid input */
    if (n <= 0)
        return -1;

    if (n == 1)
        return (ar1[0] + ar2[0])/2;

    if (n == 2)
        return (max(ar1[0], ar2[0]) + min(ar1[1], ar2[1])) / 2;

    m1 = median(ar1, n); /* get the median of the first array */
    m2 = median(ar2, n); /* get the median of the second array */

    /* If medians are equal then return either m1 or m2 */
    if (m1 == m2)
        return m1;

     /* if m1 < m2 then median must exist in ar1[m1....] and ar2[....m2] */
    if (m1 < m2)
    {
        if (n % 2 == 0)
            return getMedian(ar1 + n/2 - 1, ar2, n - n/2 +1);
        else
            return getMedian(ar1 + n/2, ar2, n - n/2);
    }

    /* if m1 > m2 then median must exist in ar1[....m1] and ar2[m2...] */
    else
    {
        if (n % 2 == 0)
            return getMedian(ar2 + n/2 - 1, ar1, n - n/2 + 1);
        else
            return getMedian(ar2 + n/2, ar1, n - n/2);
    }
}

/* Function to get median of a sorted array */
int median(int arr[], int n)
{
    if (n%2 == 0)
        return (arr[n/2] + arr[n/2-1])/2;
    else
        return arr[n/2];
}

/* Driver program to test above function */
int main()
{
    int ar1[] = {1, 2, 3, 6};
    int ar2[] = {4, 6, 8, 10};
    int n1 = sizeof(ar1)/sizeof(ar1[0]);
    int n2 = sizeof(ar2)/sizeof(ar2[0]);
    if (n1 == n2)
      printf("Median is %d", getMedian(ar1, ar2, n1));
    else
     printf("Doesn't work for arrays of unequal size");

    getchar();
    return 0;
}

/* Utility functions */
int max(int x, int y)
{
    return x > y? x : y;
}

int min(int x, int y)
{
    return x > y? y : x;
}

Time Complexity: O(logn)
Algorithmic Paradigm: Divide and Conquer
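A rough sketch of why the recursion in Method 2 is logarithmic, assuming each call does a constant amount of work c besides the recursive call. For both even and odd n the code passes on floor(n/2) + 1 elements, so with T(n) the cost for arrays of size n (the helper S is introduced here only for the substitution):

```latex
\begin{aligned}
T(n) &\le T\!\left(\lfloor n/2 \rfloor + 1\right) + c, \qquad T(2) = c \\
S(k) &:= T(k + 2) \;\Rightarrow\; S(k) \le S\!\left(\lfloor k/2 \rfloor\right) + c, \qquad S(0) = c \\
S(k) &= O(\log k) \;\Rightarrow\; T(n) = O(\log n)
\end{aligned}
```

In words: the number of elements in excess of 2 at least halves on every call, so the base case n == 2 is reached after about log2(n) calls.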



Method 3 (By doing binary search for the median):
The basic idea is that, given two sorted arrays ar1[] and ar2[] of n elements each, you can check in constant time whether an element ar1[i] is the upper of the two middle elements, i.e. the element at index n of the merged array. Since ar1[] is sorted, ar1[i] is preceded by exactly i elements of ar1[]. If it sits at index n of the merged array, the other n - i elements before it must come from ar2[], so ar1[i] must be no smaller than ar2[j], where j = n - i - 1, and no larger than ar2[j + 1].
It takes constant time to check whether ar2[j] <= ar1[i] <= ar2[j + 1]. If the check fails, then depending on whether ar1[i] is greater than ar2[j + 1] or less than ar2[j], you know that ar1[i] is greater than or less than the median. Thus you can binary search for the median in O(log n) worst-case time. For two arrays ar1[] and ar2[], first do a binary search in ar1[]. If you reach the end (left or right) of ar1[] without finding the median, search the second array ar2[] in the same way.
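The constant-time check ar2[j] <= ar1[i] <= ar2[j + 1] can be sketched as a small helper. The name classify_candidate and its -1/0/+1 return convention are assumptions made for this illustration; the implementation further down performs the same comparisons inline.

```c
#include <assert.h>

/* Hypothetical helper: classifies ar1[i] for two sorted arrays of
   n elements each.
   Returns  0 if ar2[j] <= ar1[i] <= ar2[j+1] with j = n - i - 1,
              i.e. ar1[i] can sit at index n of the merged array,
           +1 if ar1[i] is too large (keep searching ar1[left..i-1]),
           -1 if ar1[i] is too small (keep searching ar1[i+1..right]). */
int classify_candidate(const int ar1[], const int ar2[], int n, int i)
{
    assert(n > 0 && i >= 0 && i < n);
    int j = n - i - 1;   /* last element of ar2 that must not exceed ar1[i] */

    if (ar1[i] < ar2[j])
        return -1;       /* fewer than n elements can precede ar1[i] */
    if (j + 1 < n && ar1[i] > ar2[j + 1])
        return +1;       /* more than n elements precede ar1[i] */
    return 0;
}
```

With ar1[] = {1, 5, 7, 10, 13} and ar2[] = {11, 15, 23, 30, 45}, the helper returns -1 for i = 2 and i = 3 and 0 for i = 4, which is the element 13.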

1) Get the middle element of ar1[] using array indexes left and right.
   Let index of the middle element be i.
2) Calculate the corresponding index j of ar2[]
     j = n - i - 1
3) If ar1[i] >= ar2[j] and ar1[i] <= ar2[j+1] then ar1[i] is the upper
   middle element. The lower middle element is the larger of ar2[j]
   and ar1[i-1].
     return average of the two middle elements
4) If ar1[i] is greater than both ar2[j] and ar2[j+1] then
     do binary search in left half  (i.e., ar1[left ... i-1])
5) If ar1[i] is smaller than both ar2[j] and ar2[j+1] then
     do binary search in right half (i.e., ar1[i+1 ... right])
6) If you reach either end of ar1[] then do binary search in ar2[]

Example:

   ar1[] = {1, 5, 7, 10, 13}
   ar2[] = {11, 15, 23, 30, 45}

The middle element of ar1[] is 7. Compare 7 with 23 and 30: since 7 is smaller than both, move right in ar1[] and binary search in {10, 13}. This step picks 10. Now compare 10 with 15 and 23. Since 10 is smaller than both, move right again. Only 13 remains on the right side. Since 13 is greater than 11 and smaller than 15, terminate here. The median is 12 (the average of 11 and 13).

Implementation:

#include<stdio.h>

int getMedianRec(int ar1[], int ar2[], int left, int right, int n);

/* This function returns median of ar1[] and ar2[].
   Assumptions in this function:
   Both ar1[] and ar2[] are sorted arrays
   Both have n elements */
int getMedian(int ar1[], int ar2[], int n)
{
    return getMedianRec(ar1, ar2, 0, n-1, n);
}

/* A recursive function to get the median of ar1[] and ar2[]
   using binary search */
int getMedianRec(int ar1[], int ar2[], int left, int right, int n)
{
    int i, j;

    /* We have reached at the end (left or right) of ar1[] */
    if (left > right)
        return getMedianRec(ar2, ar1, 0, n-1, n);

    i = (left + right)/2;
    j = n - i - 1;  /* Index of ar2[] */

    /* Recursion terminates here.*/
    if (ar1[i] >= ar2[j] && (j == n-1 || ar1[i] <= ar2[j+1]))
    {
        /* ar1[i] is decided as median 2, now select the median 1
           (element just before ar1[i] in merged array) to get the
           average of both*/
        if (i == 0 || ar2[j] > ar1[i-1])
            return (ar1[i] + ar2[j])/2;
        else
            return (ar1[i] + ar1[i-1])/2;
    }

    /*Search in left half of ar1[]*/
    else if (ar1[i] > ar2[j] && j != n-1 && ar1[i] > ar2[j+1])
        return getMedianRec(ar1, ar2, left, i-1, n);

    /*Search in right half of ar1[]*/
    else /* ar1[i] is smaller than both ar2[j] and ar2[j+1]*/
        return getMedianRec(ar1, ar2, i+1, right, n);
}

/* Driver program to test above function */
int main()
{
    int ar1[] = {1, 12, 15, 26, 38};
    int ar2[] = {2, 13, 17, 30, 45};
    int n1 = sizeof(ar1)/sizeof(ar1[0]);
    int n2 = sizeof(ar2)/sizeof(ar2[0]);
    if (n1 == n2)
        printf("Median is %d", getMedian(ar1, ar2, n1));
    else
        printf("Doesn't work for arrays of unequal size");

    getchar();
    return 0;
}

Time Complexity: O(logn)
Algorithmic Paradigm: Divide and Conquer

The above solutions can be optimized for the cases when all elements of one array are smaller than all elements of other array. For example, in method 3, we can change the getMedian() function to following so that these cases can be handled in O(1) time. Thanks to nutcracker for suggesting this optimization.

/* This function returns median of ar1[] and ar2[].
   Assumptions in this function:
   Both ar1[] and ar2[] are sorted arrays
   Both have n elements */
int getMedian(int ar1[], int ar2[], int n)
{
   // If all elements of array 1 are smaller then
   // median is average of last element of ar1 and
   // first element of ar2
   if (ar1[n-1] < ar2[0])
     return (ar1[n-1]+ar2[0])/2;

   // If all elements of array 2 are smaller then
   // median is average of last element of ar2 and
   // first element of ar1
   if (ar2[n-1] < ar1[0])
     return (ar2[n-1]+ar1[0])/2;

   return getMedianRec(ar1, ar2, 0, n-1, n);
}

References:
http://en.wikipedia.org/wiki/Median

http://ocw.alfaisal.edu/NR/rdonlyres/Electrical-Engineering-and-Computer-Science/6-046JFall-2005/30C68118-E436-4FE3-8C79-6BAFBB07D935/0/ps9sol.pdf

Asked by Snehal

Please write comments if you find the above codes/algorithms incorrect, or find other ways to solve the same problem.





  • Guest

    Please explain how method 3 would work on following input:
    Array1: 1 2 3 4 5
    Array2: 1 2 4 5 6
    Median should be (3+4)/2 but through method 3, it comes out to be (2+4)/2. Am I going wrong somewhere ?

  • Allen

    How do we know the median of two smaller array is still the median of the two original array in method 3 ?

    • Rohit Sharma

      If the distribution is uniform, i.e. the elements of the two arrays are merged alternately.

  • james

    So what if I want to find the 5th largest number of the two arrays without merging them?
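One possible approach, sketched here under the assumption that both arrays are sorted in ascending order: find the k-th smallest element by discarding about k/2 elements per step, and express the k-th largest in terms of it. The names kth_smallest and kth_largest are made up for this sketch and are not part of the article's code.

```c
#include <assert.h>

/* k-th smallest (1-based) element of the union of two sorted arrays,
   without merging them. Each step discards about k/2 elements. */
int kth_smallest(const int *a, int m, const int *b, int n, int k)
{
    assert(m >= 0 && n >= 0 && k >= 1 && k <= m + n);
    for (;;) {
        if (m == 0) return b[k - 1];
        if (n == 0) return a[k - 1];
        if (k == 1) return a[0] < b[0] ? a[0] : b[0];

        /* Probe k/2 elements into each array, clamped to its length. */
        int i = (k / 2 < m) ? k / 2 : m;
        int j = (k / 2 < n) ? k / 2 : n;

        if (a[i - 1] <= b[j - 1]) {
            /* a[0..i-1] all rank below k, so they can be dropped. */
            a += i; m -= i; k -= i;
        } else {
            /* Symmetric case: b[0..j-1] all rank below k. */
            b += j; n -= j; k -= j;
        }
    }
}

/* k-th largest is the (m + n - k + 1)-th smallest. */
int kth_largest(const int *a, int m, const int *b, int n, int k)
{
    return kth_smallest(a, m, b, n, m + n - k + 1);
}
```

For {1, 12, 15, 26, 38} and {2, 13, 17, 30, 45}, kth_largest(..., 5) returns 17. The arrays do not need to have the same length.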

  • Timothy

    What about two different sized lists?
    If you have {1,2,3} and {4,5,6,7,8,9,10} the correct median should be 5 but when you remove {1,7,8,9,10} during the recursive algorithm, the algorithm will try to find the median for {2,3} and {5,6,7,8,9,10} and return that, which is 6.

  • Guest

    How would this work on input
    Array 1: 1, 2, 7, 8
    Array 2: 3, 4, 5, 6

    Median of the two arrays is (4 + 5) / 2, but the algorithm would get rid of either 4 or 5 in first run. Or am i missing something?

    • Eric Mengqi Han

      Array 1: [1, 2, 8, 9]
      Array 2: [3, 4, 5, 6]

      m1 = 5
      m2 = 4.5
      m1 > m2

      Array 1: [8, 9]
      Array 2: [3, 4]

      • gourav pathak

        Here Array 1 would be [2,8,9] and Array 2 would be [3,4,5]…..Take a closer look at the code

    • gourav pathak

      Here repeating elements are also counted…So the median is (3+4)/2
      as merged array is {1,1,2,2,3,4,5,6,7,8} and not {1,2,3,4,5,6,7,8}(which you are probably referring to in your comment)

  • Mangat Rai

    You have made a very big assumption that both A and B have n elements. If they have different sizes, say m and n, each of which can be odd or even, the problem becomes considerably more difficult. There will be a huge number of cases to handle.

    • Son L

      It’s actually not much more difficult. 40 lines will be enough.

      • Mangat Rai

        My point is that the problem is difficult compared to the ones discussed. The general case should be discussed here. By the way, can you give a brief approach to solve it? I have solved it the following way:
        1. Compare the medians (compare the 2 middle elements from each if even) and get the part which may hold the median.
        2. Remove the minimum even number of elements on both sides. This way odd and even arrays keep their property.

  • Marsha Donna

    In method 2, can someone please explain:
    if (m1 < m2)
    {
        if (n % 2 == 0)
            return getMedian(ar1 + n/2 - 1, ar2, n - n/2 + 1);
        else
            return getMedian(ar1 + n/2, ar2, n - n/2);
    }

    • gourav pathak

      No m1=(5+7)/2 and m2=(6+8)/2 so arr1={5,7,9,11} and arr2={2,4,6,8}….The construction of the solution is such that at each recursive call arr1 and arr2 contain same number of elements….There’s no question of having different number of elements in arr1 and arr2

  • trying

    please check method 2
    a1[]={2,10,15}
    a2[] = {4, 12, 20}.

    For the above input, the median of a1 is 10 and the median of a2 is 12. Since the median of a2 is greater than that of a1, we will take 15 from a1 and 4 from a2. So the median is (4+15)/2 = 9.5, which is incorrect; it should be (10+12)/2 = 11. Please check. Thanks.

    • falcon

      Dude, just check the condition in the 2nd method: if a2 > a1 you will take 10, 15 from a1 and 4, 12 from a2. Now
      median = (max(a1[0], a2[0]) + min(a1[1], a2[1]))/2

  • Soumya
     
        /* Recursion terminates here.*/
        if (ar1[i] > ar2[j] && (j == n-1 || ar1[i] <= ar2[j+1]))
        {
            /*ar1[i] is decided as median 2, now select the median 1
               (element just before ar1[i] in merged array) to get the
               average of both*/
            if (ar2[j] > ar1[i-1] || i == 0)
                return (ar1[i] + ar2[j])/2;
            else
                return (ar1[i] + ar1[i-1])/2;
        }
     

    if (ar2[j] > ar1[i-1] || i == 0) condition may produce unexpected result if i=0. I think we should check i==0 first.

    Correct me if I am wrong.

    • GeeksforGeeks

      Thanks for pointing this out. We have updated the code.

      • indra kumar

        please check 2nd method for test case arr1={1,3,5,7,9,11},arr2={2,4,6,8,10,12}, it is giving wrong answer…..

  • Jack

    In method 3, it is written that:
    “The basic idea is that if you are given two arrays ar1[] and ar2[] and know the length of each, you can check whether an element ar1[i] is the median in constant time.”

    How is this possible in constant time?

    • GeeksforGeeks

      Please see the next lines in post.

  • abhishek08aug

    Intelligent :)

     
     
  • shine

    Method 3 is not always going to work.
    For a1 = {1, 4} and a2 = {2, 3} it will give 1 and 3 as output.
    Correct me if I am wrong.

     
     
  • HLS.nirma

    Incorrect:
    Since the array is sorted, it is greater than exactly i-1 values in array ar1[].

     
    Correct:
    Since the array is sorted, it is greater than exactly "i" values in array ar1[].
     
    • HLS.nirma

      Kindly correct it in method 3 explanation.
      Thank you.

      • HLS.nirma

        One redundancy in method 3:

        else if (ar1[i] > ar2[j] && j != n-1 && ar1[i] > ar2[j+1])

        correct should be:
        else if (ar1[i] > ar2[j] && ar1[i] > ar2[j+1])

        j=n-1 is already handled in the “if” above this “else if”

        Thank you.

      • Kartik

        Thanks for pointing this out, we have corrected the explanation.

  • Ravi

    Can someone explain how to do this irrespective of length of both arrays. I mean let length of array1 be m and length of array2 be n, find median of 2 sorted arrays. we should not care if m = n or m != n.
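One common way to handle lengths m and n is to binary search a partition of the shorter array instead of comparing medians. The sketch below is an assumption-laden illustration, not the article's method: the name median_two_sorted is hypothetical, both inputs are assumed sorted in ascending order, and INT_MIN/INT_MAX stand in for the missing neighbours at the array ends.

```c
#include <assert.h>
#include <limits.h>

/* Median of two sorted arrays of lengths m and n (m + n > 0).
   Runs in O(log(min(m, n))) by binary searching how many elements
   of the shorter array belong to the left half of the merged order. */
double median_two_sorted(const int *a, int m, const int *b, int n)
{
    assert(m >= 0 && n >= 0 && m + n > 0);
    if (m > n) {                       /* make a the shorter array */
        const int *tp = a; a = b; b = tp;
        int tn = m; m = n; n = tn;
    }
    int half = (m + n + 1) / 2;        /* size of the left half */
    int lo = 0, hi = m;
    while (lo <= hi) {
        int i = lo + (hi - lo) / 2;    /* elements taken from a */
        int j = half - i;              /* elements taken from b */
        int a_left  = (i == 0) ? INT_MIN : a[i - 1];
        int a_right = (i == m) ? INT_MAX : a[i];
        int b_left  = (j == 0) ? INT_MIN : b[j - 1];
        int b_right = (j == n) ? INT_MAX : b[j];

        if (a_left > b_right) {
            hi = i - 1;                /* took too many from a */
        } else if (b_left > a_right) {
            lo = i + 1;                /* took too few from a */
        } else {
            int left_max = a_left > b_left ? a_left : b_left;
            if ((m + n) % 2 == 1)
                return left_max;       /* odd total: single middle element */
            int right_min = a_right < b_right ? a_right : b_right;
            return ((double)left_max + right_min) / 2.0;
        }
    }
    return 0.0;                        /* not reached for sorted input */
}
```

For {1, 2, 3} and {4, 5, 6, 7, 8, 9, 10} this returns 5.5, and for the article's example {1, 12, 15, 26, 38} and {2, 13, 17, 30, 45} it returns 16.0. The result is a double, so the average of the two middle elements is not truncated.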

  • Hary

    I am not very sure about this in method 2

     
    else
    {
         if (n % 2 == 0)
            return getMedian(ar2 + n/2 - 1, ar1, n - n/2 + 1);
         else
           return getMedian(ar2 + n/2, ar1, n - n/2);
    } 
     

    Do we really need to interchange the arrays. If so why?

    • Karthik
       
      I think the third method does not work if you give identical arrays of odd length.

      Say a = {1,2,3} and b = {1,2,3}.

      When we call the function for the first time, left = 0 and right = 2,
      so i = 1 and j = 3-1-1 = 1.

      if (ar1[i] > ar2[j] && (j == n-1 || ar1[i] <= ar2[j+1])) will be false, since a[1] > b[1] is false: both a[1] and b[1] are 2.

      So we have to call the function with left = 2 and right = 2;
      then i = 2 and j = 3-2-1 = 0.

      if (ar1[i] > ar2[j] && (j == n-1 || ar1[i] <= ar2[j+1])) will again be false, as neither j == 2 nor a[2] (3) < b[1] (2) holds, so left will become 3 and right will be 2.
      Then we swap the order of the arrays and call the function again. But both arrays are the same, so the order does not matter and the function will be called forever, which results in a seg fault.

      Correct me if I have missed any point :)
       
  • Smart Pointer

    @Geeksforgeeks, regarding method 2, do we really need to handle the case separately for odd & even values of n? Below is my implementation which looks simple & works for all cases( i think ).

     
    float findMed(int A[], int B[], int n)
    {
            if( n <= 0 ) return FLT_MIN;
     
            if( n == 1)
                    return (A[0] + B[0]) / 2.0;
            if( n == 2 )
                    return (max(A[0],B[0]) + min(A[1],B[1])) / 2.0;
            if( A[n-1] < B[0] )
                    return (A[n-1] + B[0]) / 2.0;
            if( B[n-1] < A[0] )
                    return (A[0] + B[n-1]) / 2.0;
     
            int medA = A[n/2];
            int medB = B[n/2];
     
            if(medA == medB)
                    return medA;
            if( medA < medB )
                    return findMed(A+n/2, B, n-n/2);
            return findMed(A, B+n/2, n-n/2);
    }
     

    Check output here: http://ideone.com/owrHC

    Please let me know if i am missing any cases to handle.

    • GeeksforGeeks

      Thanks for suggesting simpler code. The original code was modified to handle cases suggested by jntl. Does your code handle these cases? Please let us know.

      • Smart Pointer

        @Geeksforgeeks, a little modification in my code.
        Now, it works well for all cases. It also handles the case given by jntl. The below implementation is easy to follow.

         
        float findMed(int A[], int B[], int n)
        {
                if( n <= 0 ) return FLT_MIN;
         
                if( n == 1)
                        return (A[0] + B[0]) / 2.0;
                if( n == 2 )
                        return (max(A[0],B[0]) + min(A[1],B[1])) / 2.0;
                if( A[n-1] < B[0] )
                        return (A[n-1] + B[0]) / 2.0;
                if( B[n-1] < A[0] )
                        return (A[0] + B[n-1]) / 2.0;
         
                int medA = A[(n-1)/2];
                int medB = B[(n-1)/2];
         
                if(medA == medB)
                        return medA;
                if( medA < medB )
                        return findMed(A+(n-1)/2, B, n/2 + 1);
                return findMed(A, B+(n-1)/2, n/2 + 1);
        }
         

        Check output here: http://ideone.com/3q7O5

        • newCoder

          Yes this code works:

          I have come up with an iterative version of this :

          /**
           * There are 2 sorted arrays A and B of size n each. Write an algorithm to
           * find the median of the array obtained after merging the above 2
           * arrays (i.e. array of length 2n). The complexity should be O(log(n))
           *
           * @param a
           *            {1,3,5,7,9}
           * @param b
           *            {2,4,6,8,10}
           *
           * @return average of the 2 medians from the merged array of length 2n.
           */
          public static int findMedian(int a[], int b[]) {
              assert a.length == b.length;

              int n = a.length;

              int low1 = 0;
              int low2 = 0;

              while (n > 2) {
                  if (a[low1 + n - 1] < b[low2]) {
                      return (a[low1 + n - 1] + b[low2]) / 2;
                  }

                  if (b[low2 + n - 1] < a[low1]) {
                      return (b[low2 + n - 1] + a[low1]) / 2;
                  }

                  int m1 = median(a, n, low1);
                  int m2 = median(b, n, low2);
                  if (m1 == m2) {
                      return m1;
                  }

                  if (m1 < m2) {
                      low1 = low1 + (n - 1) / 2;
                      n = n / 2 + 1;
                  } else {
                      low2 = low2 + (n - 1) / 2;
                      n = n / 2 + 1;
                  }
              }

              if (n == 2) {
                  return (Math.max(a[low1], b[low2]) + Math.min(a[low1 + 1],
                          b[low2 + 1])) / 2;
              }

              if (n == 1) {
                  return (a[low1] + b[low2]) / 2;
              }

              return -1;
          }

  • tutum
     
    /* 
    #include<stdio.h>
    int size=8;
    
    int bsearch(int B[],int left,int right,int num)
    {
          if(left==right){
                return right;
          }
          int mid=(left+right)/2;
          if(B[mid]<=num && B[mid+1]>num){
                return mid;
          }
          if(B[mid]<num && B[mid+1]<num){
                bsearch(B,mid,right,num);
          }else{
                bsearch(B,left,mid,num);
          }
    
    }
    
    int find_median(int A[],int B[],int left,int right)
    {
          if(left==right){
                return -1;
          }
          int mid=(left+right)/2;
          int required_index=size-mid-1;
          int indx=bsearch(B,0,size-1,A[mid]);
          printf("%d\n",indx);
          if(indx==required_index){
                return A[mid];
          }
          if(indx<required_index){
                find_median(A,B,mid+1,right);
          }else{
                find_median(A,B,left,mid);
          }
    
    }
    int main()
    {
          int flag=0;
          int A[]={2,3,5,7,8,19,23,35};
          int B[]={36,37,38,39,50,55,56,57};
          if(A[0]>B[size-1]){
                printf("%d",A[0]);
                flag=1;
          }
          if(B[0]>A[size-1]){
                printf("%d",B[0]);
                flag=1;
          }
          if(flag!=1){
                int indx;
                indx=find_median(A,B,0,size-1);
                if(indx==-1){
                      indx=find_median(B,A,0,size-1);
                      printf("%d\n",indx);
                }else{
                      printf("%d\n",indx);
                }
          }
    
    return 0;
    }
     */
     
  • adarsh

    #include
    #include
    using namespace std;
    void med(int a[],int b[],int lena,int lenb);
    int main()
    {
    int a[10],b[10],lena,lenb,i;
    cout<>lena;
    for(i=0;i>a[i];
    cout<>lenb;
    for(i=0;i>b[i];
    med(a,b,lena,lenb);
    getch();
    return 0;
    }
    void med(int a[],int b[],int lena,int lenb)
    {
    int i,j,temp,cas,mid;
    float mide;
    for(i=0;i<lenb;i++)
    for(j=0;j<lena;j++)
    {
    if(b[i]<a[j])
    {
    temp=a[j];
    a[j]=b[i];
    b[i]=temp;
    }

    }
    for(i=0;i<lenb;i++)
    for(j=i+1;jb[j])
    {
    temp=b[j];
    b[j]=b[i];
    b[i]=temp;
    }
    cout<<"\narray a is\n";

    for(i=0;i<lena;i++)
    cout<<"\t"<<a[i];
    cout<<"\narray b is\n";
    for(i=0;i<lenb;i++)
    cout<<"\t"<temp)
    mid=a[temp];
    else
    {
    temp=temp-lena;
    mid=b[temp];
    }
    case 1:if(lena>temp)
    mide=float((a[temp]+a[temp-1])/2.0);
    else
    if(lena==temp)
    mide=float((a[temp-1]+b[0])/2.0);
    else
    {
    temp=temp-lena;
    mide=float((b[temp-1]+b[temp])/2.0);
    }

    }
    if(cas==2)
    cout<<"\nthe median is "<<mid;
    else
    cout<<"\nthe median is "<<mide;

    }
    advantage: 2 array differ in size

  • Ankit Gupta

    On similar lines. O(n) solution can be simplified to :

     
    int getMedian(int s1[], int n, int s2[], int m)
    {
        int m1 = 0, m2 = 0, total = m+n;  /* initialized so the first m1 = m2 copy is well defined */
    
        int i = 0, j = 0;
        for(int k = 0; k <= total>>1; k++) {
            m1 = m2;
            if (i == n) {
                m2 = s2[j++];
            } else if (j == m) {
                m2 = s1[i++];
            } else {
                m2 = (s1[i] < s2[j]) ? s1[i++] : s2[j++];
            }
        }
    
        return (total&1) ? m2 : (m1+m2)/2;
    }
     
  • kg
     
    #include<iostream>
    using namespace std;
    int median1(int *ar1,int n1,int *ar2,int n2)  // Time - O(n)  Space -  O(1)
    {
        int total = n1+n2;
        int count = total/2;
        int i=0;
        int el1 = 0, el2 = 0;  /* initialized so the first el1 = el2 copy is well defined */
        int index1 = 0,index2=0;
        for(;i<=count&&index1<n1&&index2<n2;i++)
        {
            el1 = el2;
            if(ar1[index1] < ar2[index2])
                el2 = ar1[index1++];
            else
                el2 = ar2[index2++];
        }
        while(i<=count)
        {
            el1 = el2;
            if(index1<n1)
                el2 = ar1[index1++];
            else if(index2<n2)
                el2 = ar2[index2++];
            i++;
        }
    
        if(total&1)
             return el2;
        else
            return (el2+el1)/2;
    
    }
    int main()
    {
        int ar1[] =  {2,4,6,8};
        int ar2[] = {1,3,6,9};
    
    
        int n1 = sizeof(ar1)/sizeof(ar1[0]);
        int n2 = sizeof(ar2)/sizeof(ar2[0]);
        cout<<median1(ar1,n1,ar2,n2);
    }
    
     
    • kg

      This works for arrays of different sizes and is iterative.

  • yc

    Different lengths, iterative:

     
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <algorithm>
    using namespace std;
    int cent(int a, int b, int c){
      //a and b must be in order from small to large
      return min(max(a,c),b);
    }
    
    
    double findmedian(int a1[],int n1, int a2[],int n2){
    
      int *p1, *p2;
      if(n1 < n2) {
        p1=a1;
        p2=a2;
      }else{
        p1=a2;
        p2=a1;
        int tmp=n2;
        n2=n1;
        n1=tmp;
      }
      
      if(n1 == 1){
        if(n2 == 1) 
          return 0.5*(p1[0]+p2[0]); 
        if(n2 == 2) 
          return cent(p2[0],p2[1],p1[0]);
        //n2 >=3 ;
        int mid1=(n2-1)/2;
        int mid2= n2/2;
        if(mid1==mid2) 
          return 0.5*(p2[mid1]+cent(p2[mid1-1],p2[mid1+1],p1[0]));
        return cent(p2[mid1],p2[mid2],p1[0]);
      }
      while(n1 > 0){
        if(n1 == 2){
          if(n2 == 2) 
    	return 0.5*(max(p1[0],p2[0]) + min(p1[1],p2[1]));
          
          int mid1=(n2-1)/2;
          int mid2= n2/2;
          
          return 0.5*( cent(p2[mid1-1],p2[mid1+1],cent(p1[0],p1[1],p2[mid1])) 
    		   + cent(p2[mid2-1],p2[mid2+1],cent(p1[0],p1[1],p2[mid2]))); 
          
        }
        
        int mid11=(n1-1)/2;
        int mid12=n1/2;
        int mid21=(n2-1)/2;
        int mid22=n2/2;
        
        if((mid11+mid12) == (mid21+mid22)) 
          return 0.5*(mid11+mid12);
        else if((mid11+mid12) < (mid21+mid22)){
          n1-=mid11;
          p1=&p1[mid11];
          n2=n2-mid11;
        }else{
          int trim=n1-mid12-1;
          n1=mid12+1;
          n2-=trim;
          p2=&p2[trim];
        }
        continue; 
      }
    }
    
    int main(int argc, char** argv){
    
      int a[]={-1,1,2,3,3,7,9,11,33};
      int b[]={2,4};
    
      int n1=sizeof(a)/sizeof(int);
      int n2=sizeof(b)/sizeof(int);
    
      printf("Median is %f\n",findmedian(a,n1,b,n2));
    
    
    }
    
    
     
  • Ajinkya

    Rascala different length arrays. Do this. Mind it.

     
    #include<stdio.h>
    #include<conio.h>
    #include<iostream.h>
    
    int max(int a,int b)
    {
        return((a>b)?a:b);
    }
    
    int min(int a,int b)
    {
        return((a<b)?a:b);
    }
    
    int median(int arr1[],int arr2[],int n1,int n2,int m1,int m2)
    {
        //Base case - Recursion end case
        if((n2-n1)<=1 && (m2-m1)<=1)
        {
           if((n2-n1)==1 && (m2-m1)==1) //2 elements in both sublists
              return(max(max(arr1[n1],arr2[m1]),min(arr1[n2],arr2[m2])));
           else if((n2-n1)==0) //1st sublist just 1 element
              return(min(arr2[m1],arr2[m2]));
           else //2nd sublist contains just 1 element
              return(min(arr1[n1],arr1[n2]));
        }
        
        //Recursion step
        int mid1=(n1+n2)/2;
        int mid2=(m1+m2)/2;
        if(arr1[mid1]<arr2[mid2])
           return(median(arr1,arr2,mid1,n2,m1,mid2));
        else if(arr2[mid2]<arr1[mid1])
           return(median(arr1,arr2,n1,mid1,mid2,m2));
        else if(arr1[mid1]==arr2[mid2])
        {
             if((n2-n1)>(m2-m1)) //more elements in arr1
                return(arr1[mid1+1]);
             else if((n2-n1)<(m2-m1)) //more elements in arr2
                return(arr2[mid2+1]);
             else
                return(arr1[mid1]);
        }
    }
    int main()
    {
        int arr1[]={4,21,34,56,78};
        int arr2[]={2,34,56,57,67,89};
        int n=sizeof(arr1)/sizeof(arr1[0]);
        int m=sizeof(arr2)/sizeof(arr2[0]);
        cout<<"\nMedian of merged arrays is: "<<median(arr1,arr2,0,n-1,0,m-1);
        getch();
        return 0;
    }
    
     
    • Jelum

      @Ajinkya:
      Your solution is wrong. Try:

      arr1:
      41 42 43 74 83
      arr2:
      3 6 10 25 53 76 78 84 95

      Your answer is 25.

      The right answer is the average of 43 and 53, i.e. 48.

  • Gautam
     
    
    #include<stdio.h>
    #include<stdlib.h>
    #define MIN -32767
    #define MAX  32767
    
    /*
      this can be used to find the median of two sorted array
    */
    
    int findK(int A[], int m, int B[], int n, int k)
    
    {
     if(m<0 || n<0 || k<0 || k >(m+n))
    	return -1;
    
      int i = (int)((double)m / (m+n) * (k-1));
      int j = (k-1) - i;
    
      // invariant: i + j = k-1
      // Note: A[-1] = -INF and A[m] = +INF to maintain invariant
      int Ai_1 = ((i == 0) ? MIN : A[i-1]);
      int Bj_1 = ((j == 0) ? MIN : B[j-1]);
      int Ai   = ((i == m) ? MAX : A[i]);
      int Bj   = ((j == n) ? MAX : B[j]);
    
      if (Bj_1 < Ai && Ai < Bj)
    	 return Ai;
      else if (Ai_1 < Bj && Bj < Ai)
    	 return Bj;
    
      // if none of the cases above, then it is either:
      if (Ai < Bj)
    	 // exclude Ai and below portion
    	 // exclude Bj and above portion
    	 return findK(A+i+1, m-i-1, B, j, k-i-1);
      else /* Bj < Ai */
    	 // exclude Ai and above portion
    	 // exclude Bj and below portion
    	 return findK(A, i, B+j+1, n-j-1, k-j-1);
    }
    int main()
    {
    int m,n;
    
    
    int a[3]= {1,3,5};
    int b[4]= {2,4,6,8};
    
    m=3;
    n=4;
    
     if((m+n)%2==0) 
      {
    	 int y1=findK(a,m,b,n,(m+n)/2);
    	 int y2=findK(a,m,b,n,((m+n)/2+1));
    	 printf("Median %f",(float)(y1+y2)/(float)(2.0));
      }
     else
      printf("Median %d",findK(a,m,b,n,(m+n)/2+1));
    return 0;
    }
    
     
  • GeeksforGeeks

    @All: Updates on this post have been in the queue for a long time. Apologies for the long delay. We have updated the post now.

    @nutcracker: Thanks for suggesting the optimization. We have added a point for your suggested optimization.

    @jntl: Thanks for suggesting the fix. We have incorporated your suggested changes, method 2 is now bug free.

    @Anonymous and @spandan: We will soon be publishing another post for arrays of unequal size.

  • Anonymous

    second solution might fail for cases where arrays are not of equal size.

    • kartik

      The above solutions are only for two sorted arrays of equal size. We will soon be publishing another post for unequal size.

  • Nishant

    I think Method 1 will fail in the following case:
    arr1 = {1,2,3,4,5,6,7,8,9,10,14,156}
    arr2 = { 2002, 2004,….}

    If it would not fail, can you explain why?

    • kartik

      @Nishant: You could try running the program before commenting here. Anyways, I did this for you and it worked fine. See the following program. It gives output as 1079 which is average of 156 and 2002.

       
      #include <stdio.h>
      
      /* This function returns median of ar1[] and ar2[].
         Assumptions in this function:
         Both ar1[] and ar2[] are sorted arrays
         Both have n elements */
      int getMedian(int ar1[], int ar2[], int n)
      {
        int i = 0;  /* Current index of i/p array ar1[] */
        int j = 0; /* Current index of i/p array ar2[] */
        int count;
        int m1 = -1, m2 = -1;
      
        /* Since there are 2n elements, median will be average
         of elements at index n-1 and n in the array obtained after
         merging ar1 and ar2 */
        for(count = 0; count <= n; count++)
        {
          /*Below is to handle case where all elements of ar1[] are
            smaller than smallest(or first) element of ar2[]*/
          if(i == n)
          {
            m1 = m2;
            m2 = ar2[0];
            break;
          }
      
          /*Below is to handle case where all elements of ar2[] are
            smaller than smallest(or first) element of ar1[]*/
          else if(j == n)
          {
            m1 = m2;
            m2 = ar1[0];
            break;
          }
      
          if(ar1[i] < ar2[j])
          {
            m1 = m2;  /* Store the prev median */
            m2 = ar1[i];
            i++;
          }
          else
          {
            m1 = m2;  /* Store the prev median */
            m2 = ar2[j];
            j++;
          }
        }
      
        return (m1 + m2)/2;
      }
      
      /* Driver program to test above function */
      int main()
      {
         int ar1[] = {1,2,3,4,5,6,7,8,9,10,14,156};
         int ar2[] = {2002, 2004, 2006, 2008,2010, 2012, 2014, 2016,2018, 2020, 2022, 2024};
         int n = sizeof(ar1)/sizeof(ar1[0]);
         printf("%d", getMedian(ar1, ar2, n)) ;
      
         getchar();
         return 0;
      }
       
  • Anand
  • http://codeinterview.blogspot.com/ John

    Nice code!

    Can you walk me through the following case:
    int ar1[] = {1, 3, 5, 7};
    int ar2[] = {2, 8, 10};
    I am not sure it finds the right median. Thank you.

  • Zero

    Hey Guys,

    Curious if this would work. Tested it on a few samples n looks fine.

    let A and B be the two sorted arrays.

    m = length(A) n=length(B)

    Have two pointers, ptrA and ptrB, pointing to the first elements of A and B respectively. If A[ptrA] <= B[ptrB], increment ptrA; else increment ptrB. Stop when the number of increments is equal to (m+n)/2.

    If m+n is odd, the median is the minimum of A[ptrA] and B[ptrB]. If it is even, take the mean of the min number and the next greater number.

    Kindly let me know if this works!

    • kartik

      @Zero: This method looks same as method 1. Correct me if I am wrong.
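
      The two-pointer idea described above can be sketched for arrays of different lengths as follows. This is only an illustration, not code from the post: the name medianByMerging is made up here. One assumption differs from the description: for an even total the sketch averages the elements at merged indexes (m+n)/2 - 1 and (m+n)/2, so it remembers the last element consumed instead of looking at the next greater one. It also checks the array bounds before each comparison. The running time is O(m+n), so it does not meet the O(log(n)) target.

```c
#include <assert.h>

/* Hypothetical helper (not from the post): median of the merge of two
   sorted arrays a[0..m-1] and b[0..n-1], found by counting while merging. */
double medianByMerging(const int a[], int m, const int b[], int n)
{
    int total = m + n;
    int i = 0, j = 0;       /* current positions in a[] and b[] */
    int prev = 0, curr = 0; /* last two elements taken from the merge */

    assert(total > 0);

    /* Consume total/2 + 1 elements, so curr ends up holding the element
       at merged index total/2 and prev the one just before it. */
    for (int count = 0; count <= total / 2; count++) {
        prev = curr;
        if (j == n || (i < m && a[i] <= b[j]))
            curr = a[i++];  /* b[] is exhausted or a[i] is not larger */
        else
            curr = b[j++];
    }

    /* Odd total: single middle element. Even total: mean of the two. */
    return (total % 2) ? curr : (prev + curr) / 2.0;
}
```

      For example, calling it with {1, 3, 5, 7} and {2, 8, 10} walks through 1, 2, 3, 5 and stops at the middle element of the seven merged values.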

  • ravi

    just check for ar1[0] & ar2[0]
    if ar1[0] is smaller then call getMedianRec for ar1
    else call getMedianRec for ar2.

  • nutcracker
      
    
    int getMedian(int ar1[], int ar2[], int n)
    {
    if (ar1[n-1] < ar2[0]) return (ar1[n-1]+ar2[0])/2;
    if (ar2[n-1] < ar1[0]) return (ar2[n-1]+ar1[0])/2;
    return getMedianRec(ar1, ar2, 0, n-1, n);
    } 
    • nutcracker
       
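      /* this code looks more clean */
      #include <stdio.h>

      /* Forward declaration, since getMedian() calls getMedianRec() before it is defined */
      int getMedianRec(int ar1[], int ar2[], int left, int right, int n);
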
      int getMedian(int ar1[], int ar2[], int n)
      {
      if (ar1[n-1] < ar2[0]) return (ar1[n-1]+ar2[0])/2;
      if (ar2[n-1] < ar1[0]) return (ar2[n-1]+ar1[0])/2;
      return getMedianRec(ar1, ar2, 0, n-1, n);
      }
      
      int getMedianRec(int ar1[], int ar2[], int left, int right, int n)
      {
        int i, j;  
       
        /* We have reached at the end (left or right) of ar1[] */
        if(left > right)
          return getMedianRec(ar2, ar1, 0, n-1, n);
       
        i = (left + right)/2;
        j = n - i - 1;  /* Index of ar2[] */
        if (i==0 || j==0) return (ar1[i]+ar2[j])/2;
       
       /* Recursion terminates here.*/
        if(ar1[i] > ar2[j] && (ar1[i] <= ar2[j+1]))
        {
           return (ar1[i] + ar1[i-1])/2;
        }  
       
        /*Search in left half of ar1[]*/
        else if (ar1[i] > ar2[j] && ar1[i] > ar2[j+1])
          return getMedianRec(ar1, ar2, left, i-1, n);              
       
        /*Search in right half of ar1[]*/
        else /* ar1[i] is smaller than both ar2[j] and ar2[j+1]*/
          return getMedianRec(ar1, ar2, i+1, right, n);
      }
       
      /* Driver program to test above function */
      int main()
      {
      //  int ar1[] = {1, 12, 15, 26, 38};
      //  int ar2[] = {2, 13, 17, 30, 45};
      //  int ar1[] = {1,3,5,7,11};
      //  int ar2[] = {9,13,15,17,19};
        int ar1[] = {1,3,5,7,9};
        int ar2[] = {11,13,15,17,19};
      
        printf("%d", getMedian(ar1, ar2, 5)) ;
       
        getchar();
        return 0;
      } 
    • bala

      Yes, this check would induce another best case run of O(1). Nice !

  • nutcracker

    Shouldn’t we first check if all elements of one array are smaller or greater than those of the other array?
    int getMedian(int ar1[], int ar2[], int n)
    {
    if (ar1[n-1] < ar2[0]) return ((ar1[n-1]+ar2[0])/2);
    if (ar1[0] > ar2[n-1]) return ((ar2[n-1]+ar1[0])/2);
    return getMedianRec(ar1, ar2, 0, n-1, n);
    }

  • nutcracker

    an O(2n) algo would be

     
    int median (int a[], int b[], int n)
    {
      int i=n-1;int j=i;
      while (j>=0)
      {
        if (a[i]>b[j]) { swap(a[i],b[j]);}
          j--;
      }
      return ((a[n-1] + b[0])/2);
    }
     
  • Crime_Master_GoGo

    That was a good explanation.

    I would like to know your approach when the two arrays have different lengths.

  • spandan

    How about doing this for sorted arrays with unequal numbers of elements?

  • Rahul jain

    heyy, really a very nice article!!!

  • marius

    very nice article.
    great algorithms, elegant solutions and very good explanations!

    Thanks!

  • Abhinav Raghunandan

    Good work. Thanks for sharing your knowledge. The way you explained it was simply great!

  • jntl

    IMHO, the code of Method 2 should be modified as:

     
    if (m1 < m2)  
    {
        if (n % 2 == 0)    
            return getMedian(ar1 + n/2 - 1, ar2, n - n/2 +1);
        else
            return getMedian(ar1 + n/2, ar2, n - n/2);
    }
    
    if (n % 2 == 0)
        return getMedian(ar2 + n/2 - 1, ar1, n - n/2 + 1);  
    else
        return getMedian(ar2 + n/2, ar1, n - n/2);
     
  • jntl

    In Method 2, consider the following test case:
    arr1[]={2, 4, 6, 10}
    arr2[]={1, 3, 9, 12}
    Method 1 returns (4 + 6 ) / 2 = 5, which is correct.
    Method 2 returns (3 + 6) / 2 = 4, which is wrong!
    This is because Method 2 picks {6, 10} from arr1 and {1, 3} from arr2, which loses the correct pair {4, 6}.
    We should pick {4, 6, 10} from arr1 and {1, 3, 9} from arr2 instead.
    The algorithm of Method 2 should be modified as below:
    if (m1 < m2) … and similarly for the m1 > m2 case.

    • sourabhjakhar

      if we take array a[(n/2-1)…..],a[…….(n/2+1)]
      if n is even then it gives the correct answer for this case also
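
      The fix discussed in this thread (keep one extra element around the middle when n is even) can be put together into a complete function roughly as below. This is a sketch under assumptions, not the post's final code: the names medianOne and medianEqual are made up for illustration, both arrays are assumed sorted with n elements each, and the result is a double so that averages are not truncated.

```c
#include <assert.h>

static int maxInt(int x, int y) { return x > y ? x : y; }
static int minInt(int x, int y) { return x < y ? x : y; }

/* Median of a single sorted array of n elements. */
static double medianOne(const int arr[], int n)
{
    return (n % 2) ? arr[n / 2] : (arr[n / 2 - 1] + arr[n / 2]) / 2.0;
}

/* Hypothetical name: median of two sorted arrays of n elements each. */
double medianEqual(const int ar1[], const int ar2[], int n)
{
    assert(n > 0);
    if (n == 1)
        return (ar1[0] + ar2[0]) / 2.0;
    if (n == 2)
        return (maxInt(ar1[0], ar2[0]) + minInt(ar1[1], ar2[1])) / 2.0;

    double m1 = medianOne(ar1, n);
    double m2 = medianOne(ar2, n);
    if (m1 == m2)
        return m1;

    /* Number of elements dropped from each array. For even n one fewer
       is dropped, so both middle candidates survive the cut. */
    int skip = (n % 2 == 0) ? n / 2 - 1 : n / 2;

    if (m1 < m2)   /* keep upper part of ar1 and lower part of ar2 */
        return medianEqual(ar1 + skip, ar2, n - skip);
    /* otherwise keep upper part of ar2 and lower part of ar1 */
    return medianEqual(ar2 + skip, ar1, n - skip);
}
```

      With arr1[] = {2, 4, 6, 10} and arr2[] = {1, 3, 9, 12}, the first call keeps {4, 6, 10} and {1, 3, 9}, as suggested above, and the recursion ends on {4, 6} and {3, 9}.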

  • jntl

    In the code, integer division is used, so the median of 1 and 4 is (1 + 4) / 2 = 2.
    I think it is better to use float division so that the median is 2.5, which is more precise.
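
    A small illustration of the difference, with made-up helper names (averageTruncated, averageExact and checkAverages are not from the post): dividing an int sum by 2 truncates, while dividing by 2.0 converts the sum to double first.

```c
#include <assert.h>

/* Integer division: the fractional part is discarded. */
int averageTruncated(int x, int y)
{
    return (x + y) / 2;
}

/* Dividing by 2.0 promotes the sum to double, so 0.5 is kept. */
double averageExact(int x, int y)
{
    return (x + y) / 2.0;
}

/* Self-check using the values from the comment above. */
void checkAverages(void)
{
    assert(averageTruncated(1, 4) == 2);
    assert(averageExact(1, 4) == 2.5);
}
```

    Changing the return type of getMedian() to double and dividing by 2.0 in its return statements would be one way to apply this.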

  • Sandeep

    @Rohini: In the above algorithms/codes, it is assumed that arrays are of equal size, but can be easily modified for the arrays of different sizes.

  • Rohini

    Does this work for 2 different sizes of the sorted array?

  • geeksforgeeks

    @rv_10987: Thanks very much for pointing out this case. We have made changes to handle it. For median of a single array arr[], we have added a function median() that returns appropriate median.

  • rv_10987

    In method 2:
    Test case: arr1[]={2,4,6,8}
    arr2[]={1,3,6,9}
    n=4
    As m1=arr1[n/2]=6 and m2=arr2[n/2]=6 so the o/p would be 6. But the median should be (4+6)/2=5.

  • TJ

    @Ved, instead of passing the reduced array, you may use the lower index and higher index that bound the reduced array.
    So, instead of just passing the Array1, Array2
    Pass low1, high1 and low2 and high2 along with Array1 and Array2.
    So, mid1 = (low1+high1)/2 and mid2=(low2+high2)/2
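
    A rough sketch of this index-based variant is below. It is written in C to match the rest of the thread, but it uses no pointer offsets, so the same arithmetic carries over to Java. The names medianOfRange and medianByBounds are made up for illustration, and both ranges are assumed to be sorted and of equal length.

```c
#include <assert.h>

/* Median of the sorted range arr[low..high] (inclusive bounds). */
static double medianOfRange(const int arr[], int low, int high)
{
    int n = high - low + 1;
    int mid = low + n / 2;
    return (n % 2) ? arr[mid] : (arr[mid - 1] + arr[mid]) / 2.0;
}

/* Hypothetical name: median of a[low1..high1] and b[low2..high2]. */
double medianByBounds(const int a[], int low1, int high1,
                      const int b[], int low2, int high2)
{
    int n = high1 - low1 + 1;
    assert(n > 0 && n == high2 - low2 + 1);

    if (n == 1)
        return (a[low1] + b[low2]) / 2.0;
    if (n == 2) {
        int lo = a[low1] > b[low2] ? a[low1] : b[low2];     /* max of firsts */
        int hi = a[high1] < b[high2] ? a[high1] : b[high2]; /* min of lasts */
        return (lo + hi) / 2.0;
    }

    double m1 = medianOfRange(a, low1, high1);
    double m2 = medianOfRange(b, low2, high2);
    if (m1 == m2)
        return m1;

    /* Elements to drop from each range; one fewer when n is even. */
    int skip = (n % 2 == 0) ? n / 2 - 1 : n / 2;

    if (m1 < m2)   /* shrink a[] from the left and b[] from the right */
        return medianByBounds(a, low1 + skip, high1, b, low2, high2 - skip);
    /* otherwise shrink a[] from the right and b[] from the left */
    return medianByBounds(a, low1, high1 - skip, b, low2 + skip, high2);
}
```

    The initial call passes 0 and n-1 as the bounds of both arrays.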

  • http://techpuzzl.wordpress.com/ Ved

    The method 2 (Median of Median) using recursion is very good for languages where we can pass an array with an offset.
    I am not able to translate the same logic in Java where I can not pass the reduced array for each recursive call.
    Any ideas ?

  • geeksforgeeks

    @sachin midha: Thanks very much for pointing out the bug. We have included the suggested changes to the original post.

  • sachin midha

    I did not run the program, so I don't know whether it gives the correct result or not, but this is what I thought:
    for the ex. that you have given with
    ar1[] = {1, 2, 3, 4, 6}
    count & i in the called function are each 0 initially
    after pass 1 : i=1 & count=1
    after pass 2 : i=2 & count=2
    after pass 3 : i=3 & count=3
    after pass 4 : i=4 & count=4
    after pass 5 : i=5 & count=5
    Now, since count=5, the loop will run again and will compare
    ar1[5] with ar2[0]. But ar1[5] is not a legitimate element, so this is conceptually wrong. (The array bounds are crossed; this is not reported as an error and the program still runs.)
    In your case, you might be getting the correct answer because the random element ar1[5] was by chance greater than ar2[0].
    Although it is a very small bug, I thought I would bring it up so that I can find out whether there is some problem in my evaluation.
    This problem can be eradicated by putting an if at the start of the loop, before the comparison of
    ar1[i] & ar2[j].

     
    if(i==n)
    { 
       m1=m2;
       m2=ar2[0]; 
       break; 
    }
    else if(j==n)
    {
       m1=m2; 
       m2=ar1[0]; 
       break; 
    }
     

    I hope I'm clear enough this time.

  • geeksforgeeks

    @sachin midha: Could you please provide example arrays for which the method 1 didn’t work. We tried below for method 1 and got the correct answer.

     
    /* Driver program to test method 1 */
    int main()
    {
       int ar1[] = {1, 2, 3, 4, 6};
       int ar2[] = {10, 13, 17, 30, 45};
    
       printf("%d", getMedian(ar1, ar2, 5)) ;
    
       getchar();
       return 0;
    }
     
  • sachin midha

    In method 1, one case has not been taken care of, i.e.,
    when all the elements of one array are smaller than all the elements of the other array.
    In this case, suppose the elements of arr1 are all less than the first element of arr2. Then when count = n, the index of arr1 that is accessed will be arr1[n], which is not an existing element, hence generating an error.
    Similarly, when all arr2 elements are smaller, arr2[n] would be accessed, which is again an error condition.

  • geeksforgeeks

    @Minjie Zha: Thanks very much for pointing out the typo. We have corrected it.

  • Minjie Zha

    Step (5) in method 3, there is a typo. I think it should be “smaller” instead of “greater”.

  • Rachel

    I think this article made some interesting points, I read a textbook directly related to this topic, its called Probability: Theory and Examples by Richard Durrett , I found my used copy for less than the bookstores at http://www.belabooks.com/books/9780534424411.htm

  • Snehal

    Thanks a lot …The Best part of the solution is the way of explaining …