Al Idian

A Classic Case of Simpson's Paradox


I was introduced to Simpson’s Paradox in an introductory statistics course at university. I found the concept particularly insightful, and it has stuck with me since. To illustrate Simpson’s Paradox, here is a classic example.

In 1973, the University of California, Berkeley was sued for an apparent gap between the acceptance rates of men and women in its graduate school admissions (Bickel et al., 1975).

Gender   Applications   Admitted
Men      8442           44%
Women    4321           35%

Men submitted roughly twice as many applications as women, and the proportion of women admitted appeared to be markedly lower (35% versus 44%).

When the data were broken down by individual department, however, a different pattern emerged.

Department   Men Applied   Men Admitted   Women Applied   Women Admitted
A            825           62%            108             82%
B            560           63%            25              68%
C            325           37%            593             34%
D            417           33%            375             35%
E            191           28%            393             24%
F            373           6%             341             7%

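The breakdown reveals that, within each department, the admission rates for men and women were actually much closer to equal, and in four of the six departments (A, B, D, and F) women were admitted at a higher rate than men. In fact, Bickel et al. claim that the data even showed a “small but statistically significant bias in favor of women.”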

The reason for the discrepancy: women applied disproportionately to departments with low admission rates (e.g. Arts and Humanities), while more men applied to departments with higher admission rates (e.g. Science and Engineering). In the table above, for example, the two departments with the highest admission rates (A and B) received 1,385 applications from men but only 133 from women.
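To make the weighting effect concrete, here is a rough Python sketch that pools the six departments from the table. It rests on a few assumptions: the admitted counts are reconstructed from the rounded percentages, so they are approximations, and the six departments are only part of the full data set, so the pooled figures will not match the campus-wide 44% and 35% exactly. The names `departments` and `pooled_rate` are my own, chosen for illustration.

```python
# Applicant counts and rounded admission rates copied from the department table.
# Each entry: department -> (men applied, men rate, women applied, women rate)
departments = {
    "A": (825, 0.62, 108, 0.82),
    "B": (560, 0.63, 25, 0.68),
    "C": (325, 0.37, 593, 0.34),
    "D": (417, 0.33, 375, 0.35),
    "E": (191, 0.28, 393, 0.24),
    "F": (373, 0.06, 341, 0.07),
}


def pooled_rate(applied_and_rates):
    """Applicant-weighted average admission rate across departments."""
    pairs = list(applied_and_rates)
    total_applied = sum(applied for applied, _ in pairs)
    # Approximate admitted counts: applicants times the rounded rate.
    total_admitted = sum(applied * rate for applied, rate in pairs)
    return total_admitted / total_applied


men_pooled = pooled_rate((m, mr) for m, mr, _, _ in departments.values())
women_pooled = pooled_rate((w, wr) for _, _, w, wr in departments.values())

# Departments in which women were admitted at a higher rate than men.
women_ahead = [d for d, (_, mr, _, wr) in departments.items() if wr > mr]

# Share of each group's applications that went to the two
# highest-admitting departments, A and B.
men_total = sum(m for m, _, _, _ in departments.values())
women_total = sum(w for _, _, w, _ in departments.values())
men_share_ab = (departments["A"][0] + departments["B"][0]) / men_total
women_share_ab = (departments["A"][2] + departments["B"][2]) / women_total

print(f"Pooled rate, men:   {men_pooled:.1%}")
print(f"Pooled rate, women: {women_pooled:.1%}")
print(f"Women ahead in departments: {women_ahead}")
print(f"Share applying to A or B, men:   {men_share_ab:.1%}")
print(f"Share applying to A or B, women: {women_share_ab:.1%}")
```

With these inputs, the pooled rates come out to about 44.5% for men and 30.3% for women, even though women have the higher rate in departments A, B, D, and F. About half of the men's applications (51.5%) went to A or B, compared with roughly 7% of the women's, which is what pulls the two pooled averages apart.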

This disappearance (or reversal) of a trend upon combining or breaking apart groups is known as Simpson’s Paradox.

While Simpson’s Paradox has been studied for decades and has essentially been solved (see this answered question from Stack Exchange), the public can often seem defenseless against the unethical manipulation of statistics to spread misinformation and advance various agendas.

Just another reminder to maintain a healthy level of skepticism and unbelief, especially when faced with extraordinary claims.