# Probability and Statistics | Simpson’s Paradox (UC Berkeley’s Lawsuit)

**Simpson’s Paradox**, in layman’s terms, is a reversal of a relationship in data: a trend that holds within each subgroup can weaken or reverse once the subgroups are combined.

For example, suppose a university has two departments, and in each of them a woman has a higher probability of being accepted than a man. Intuition says that women should then also have the higher acceptance probability in the combined data, but this may not be true.

**Mathematically**

Given a1/b1 < c1/d1 and a2/b2 < c2/d2, does it follow that (a1+a2)/(b1+b2) < (c1+c2)/(d1+d2)?

Simpson’s Paradox says it may not be true. For example:

7/8 < 2/2 and 1/2 < 5/8, yet (7+1)/(8+2) > (2+5)/(2+8), that is, 8/10 > 7/10.
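The numeric counterexample can be checked with exact fractions. The following is a minimal sketch; the helper name `pooled` and the variable names are illustrative choices, not part of the original article.

```python
from fractions import Fraction

def pooled(pairs):
    """Combine (numerator, denominator) pairs by adding numerators and denominators."""
    return Fraction(sum(n for n, _ in pairs), sum(d for _, d in pairs))

left = [(7, 8), (1, 2)]    # a1/b1 and a2/b2
right = [(2, 2), (5, 8)]   # c1/d1 and c2/d2

# Within each pair of subgroups, the left ratio is strictly smaller.
subgroup_holds = all(Fraction(*l) < Fraction(*r) for l, r in zip(left, right))

pooled_left = pooled(left)     # (7+1)/(8+2)
pooled_right = pooled(right)   # (2+5)/(2+8)

# After pooling, the direction of the inequality flips.
reversed_after_pooling = pooled_left > pooled_right

print(subgroup_holds, pooled_left, pooled_right, reversed_after_pooling)
# prints: True 4/5 7/10 True
```

Using `Fraction` instead of floats keeps the comparisons exact, so the reversal cannot be blamed on rounding.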

A similar case arose in the lawsuit against UC Berkeley over its admissions data, which showed that men had a higher probability of being admitted than women. When the individual departments were examined, however, the picture reversed: most of the departments favored women over men.

| | Applicants | Admitted |
|---|---|---|
| Men | 8442 | 44% |
| Women | 4321 | 35% |

| Department | Men: Applicants | Men: Admitted | Women: Applicants | Women: Admitted |
|---|---|---|---|---|
| A | 825 | 62% | 108 | 82% |
| B | 560 | 63% | 25 | 68% |
| C | 325 | 37% | 593 | 34% |
| D | 417 | 33% | 375 | 35% |
| E | 191 | 28% | 393 | 24% |
| F | 272 | 6% | 341 | 7% |

Why was this happening?

**Reason:**

This behavior arose because more women applied to competitive departments with low admission rates, whereas more men applied to less competitive departments with high acceptance rates.

The table shows that 825 men applied to department **A**, which has a high acceptance rate, compared with only 108 women, whereas more women than men applied to low-acceptance departments such as **E** and **F**. As a result, the university admitted a larger share of its male applicants than of its female applicants overall.
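The effect of this application pattern can be reproduced from the six departments in the table. The sketch below makes two assumptions: admitted counts are estimated as applicants times the rounded percentage, and only departments A to F are pooled. The pooled rates it prints therefore need not match the university-wide 44% and 35%. The names `men`, `women`, and `pooled_rate` are illustrative.

```python
# (applicants, admission rate) per department, copied from the department table
men = {"A": (825, 0.62), "B": (560, 0.63), "C": (325, 0.37),
       "D": (417, 0.33), "E": (191, 0.28), "F": (272, 0.06)}
women = {"A": (108, 0.82), "B": (25, 0.68), "C": (593, 0.34),
         "D": (375, 0.35), "E": (393, 0.24), "F": (341, 0.07)}

def pooled_rate(groups):
    """Applicant-weighted admission rate across all departments."""
    total_applicants = sum(n for n, _ in groups.values())
    estimated_admits = sum(n * rate for n, rate in groups.values())
    return estimated_admits / total_applicants

# Departments where women have the higher admission rate
women_ahead = [dept for dept in men if women[dept][1] > men[dept][1]]

men_rate = pooled_rate(men)
women_rate = pooled_rate(women)

print(women_ahead)                           # ['A', 'B', 'D', 'F']
print(f"{men_rate:.0%} vs {women_rate:.0%}")  # 46% vs 30%
```

Women lead in four of the six departments, yet the pooled rate is higher for men, because the men's applications are concentrated in A and B, where acceptance rates are high.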

**Another Example:**

Suppose we have four jars containing beans of two colors, green and blue, configured as shown in the figure below.

**Before Mixing:**

Probability of picking a green bean from each jar:

- Jar 1 vs. Jar 2: 7/8 < 2/2
- Jar 3 vs. Jar 4: 1/2 < 5/8

**After Mixing:**

Probability of picking a green bean from each combined jar:

- (Jar 1 + Jar 3) vs. (Jar 2 + Jar 4): 8/10 > 7/10

Here too, Jar 1 and Jar 3 initially had a lower probability of yielding a green bean than Jar 2 and Jar 4, respectively, but after mixing the contents of the jars the relationship reversed: Jar 1 and Jar 3 combined have a higher probability of yielding a green bean than Jar 2 and Jar 4 combined. This is a very simple example of Simpson’s Paradox.
