Mann and Whitney U test

Last Updated : 26 Nov, 2020

Mann and Whitney’s U-test, also called the Wilcoxon rank-sum test, is a non-parametric statistical hypothesis test used to analyze the difference between two independent samples of ordinal data. In this test, we are given two randomly drawn samples and have to verify whether these two samples come from the same population.

Assumptions for the Mann-Whitney U test:

All observations of both groups are independent of each other.
The values of the dependent variable should be at least ordinal (that is, they can be compared to each other and ranked from highest to lowest).
The independent variable should consist of two independent, categorical groups.
For each sample, the recommended number of observations is between 5 and 20.
The null hypothesis in the Mann-Whitney U test is always the same, i.e. there is no significant difference between the two samples.
The Mann-Whitney test is applied to two distributions that need not be normally distributed but should have the same shape. For example, if the distribution of one sample has a longer right tail, the distribution of the other sample should also have a longer right tail.

The advantage of using the Mann-Whitney U test is that it is robust to outliers, because it is computed from the ranks of the observations rather than from their means.

Steps for performing the Mann-Whitney U test:

Collect two samples, sample 1 and sample 2.
Take the first observation from sample 1 and compare it with the observations in sample 2. Count the number of observations in sample 2 that are smaller than it, and add one half for each observation that is equal to it. For example, if 10 observations in sample 2 are smaller than the first observation in sample 1 and 2 are equal to it, the contribution of this observation to the U statistic is 10 + 2(1/2) = 11.
Repeat Step 2 for all observations in sample 1.
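Add up all of your totals from Steps 2 and 3. This sum is a U statistic obtained by direct counting. Equivalently, U can be obtained from rank sums: rank all observations from both samples together (averaging the ranks of tied values) and add up the ranks within each sample.
Now, we calculate the U statistics from the rank sums using the following formulas: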

$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$

$U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$

where:
- n₁: number of observations in sample 1
- n₂: number of observations in sample 2
- R₁: Rank sum of sample 1
- R₂: Rank sum of sample 2
Now, our test statistic (U) will be the smaller of U₁ and U₂.
Now, we look up the critical value in the table for the given n₁ and n₂ (call it U₀).
- if U ≤ U₀: we reject the null hypothesis.
- else, we do not reject the null hypothesis.
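The rank-sum steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names (`average_ranks`, `mann_whitney_u`, `reject_null`) are hypothetical, the demo data at the bottom is made up purely for illustration, and the critical value U₀ is assumed to be looked up separately in a table.

```python
def average_ranks(values):
    """Return the 1-based rank of each value, averaging ranks across ties."""
    ranks = []
    for v in values:
        smaller = sum(1 for w in values if w < v)
        equal = sum(1 for w in values if w == v)  # counts v itself
        # Tied values occupy ranks smaller+1 .. smaller+equal; use their mean.
        ranks.append(smaller + (equal + 1) / 2)
    return ranks


def mann_whitney_u(sample_1, sample_2):
    """Compute rank sums and U statistics with the formulas given above."""
    n1, n2 = len(sample_1), len(sample_2)
    # Rank all observations from both samples together.
    ranks = average_ranks(list(sample_1) + list(sample_2))
    r1 = sum(ranks[:n1])  # rank sum of sample 1
    r2 = sum(ranks[n1:])  # rank sum of sample 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return {"R1": r1, "R2": r2, "U1": u1, "U2": u2, "U": min(u1, u2)}


def reject_null(u, critical_value):
    """Decision rule: reject the null hypothesis when U <= U0."""
    return u <= critical_value


# Made-up illustrative data (not taken from any real experiment).
result = mann_whitney_u([1, 4, 6], [2, 3, 5, 7])
print(result)
```

The quadratic-time ranking is chosen for readability; for the small sample sizes this test is usually tabulated for, that cost is negligible.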

Examples:

Suppose a test is given to two batches of students, and the results are below:

Batch 1	Batch 2
3	9
4	7
2	5
6	10
2	8
5	6

Here, our null hypothesis will be
- H₀: There is no significant difference between batches.
- H_A: There is a significant difference between batches.
Here, our level of significance is 0.05.
Now, we rank all the observations from both batches together. If two observations are tied, each of them gets the average of the ranks they occupy.

Batch 1	Rank (Batch 1)	Batch 2	Rank (Batch 2)
2	1.5	5	5.5
2	1.5	6	7.5
3	3	7	9
4	4	8	10
5	5.5	9	11
6	7.5	10	12
Rank Sum	23	Rank Sum	55

Now, we calculate the U-statistics:

$U_1 = 6 \times 6 + \frac{6 \times 7}{2} - 23 = 34$

$U_2 = 6 \times 6 + \frac{6 \times 7}{2} - 55 = 2$
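As a quick sanity check on this arithmetic, the ranks of the 12 pooled observations must sum to 12 · 13 / 2, and the two U values must then add up to n₁n₂. With the numbers from this example:

```latex
R_1 + R_2 = 23 + 55 = 78 = \frac{12 \cdot 13}{2}, \qquad
U_1 + U_2 = 34 + 2 = 36 = 6 \cdot 6 = n_1 n_2
```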

So, our test statistic is U = min(U₁, U₂) = min(34, 2) = 2.
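The same value can be cross-checked with the pair-counting procedure from Steps 2 and 3. The sketch below uses a hypothetical helper, `pair_count`, which gives one point for each value in the second sample that is smaller and half a point for each tie. With the formulas as written in this article, the count taken over batch 1 matches U₂ and the count taken over batch 2 matches U₁, so the minimum is unchanged.

```python
batch_1 = [3, 4, 2, 6, 2, 5]
batch_2 = [9, 7, 5, 10, 8, 6]


def pair_count(sample_a, sample_b):
    """For each value in sample_a, add 1 per smaller value in sample_b and 1/2 per tie."""
    total = 0.0
    for a in sample_a:
        for b in sample_b:
            if b < a:
                total += 1.0
            elif b == a:
                total += 0.5
    return total


count_1 = pair_count(batch_1, batch_2)  # counting over batch 1
count_2 = pair_count(batch_2, batch_1)  # counting over batch 2
print(count_1, count_2)            # 2.0 34.0
print(min(count_1, count_2))       # 2.0
```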
Now, we look up the critical value in the Mann-Whitney U table (two-tailed test) for n₁ = 6, n₂ = 6 and a level of significance of 0.05. Here, our critical value is:

$U_0 = 5$

Here U ≤ U₀ (2 ≤ 5), so we reject the null hypothesis.

Implementation:

# Mann-Whitney U test on the two batches from the example above
from scipy.stats import mannwhitneyu

batch_1 = [3, 4, 2, 6, 2, 5]
batch_2 = [9, 7, 5, 10, 8, 6]

# Perform the two-sided Mann-Whitney U test
stat, p_value = mannwhitneyu(batch_1, batch_2, alternative='two-sided')
print('Statistics=%.2f, p=%.2f' % (stat, p_value))

# Level of significance
alpha = 0.05

# Conclusion
if p_value < alpha:
    print('Reject Null Hypothesis (Significant difference between two samples)')
else:
    print('Do not Reject Null Hypothesis (No significant difference between two samples)')

Output:

Statistics=2.00, p=0.01
Reject Null Hypothesis (Significant difference between two samples)
