Python – 68-95-99.7 rule in Statistics
The Empirical Rule (also called the 68-95-99.7 Rule or the Three Sigma Rule) states that for any normal distribution with mean μ and standard deviation σ, the following holds:
- About 68% of the observed values lie within 1 standard deviation of the mean: P(μ − σ ≤ X ≤ μ + σ) ≈ 0.68
- About 95% of the observed values lie within 2 standard deviations of the mean: P(μ − 2σ ≤ X ≤ μ + 2σ) ≈ 0.95
- About 99.7% of the observed values lie within 3 standard deviations of the mean: P(μ − 3σ ≤ X ≤ μ + 3σ) ≈ 0.997
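To get a feel for the rule before computing exact values, it can be checked approximately by simulation. The following is a minimal sketch using only Python's standard library; the helper name `sample_fraction_within`, the seed, and the sample size are illustrative choices, not part of the rule itself.

```python
import random


def sample_fraction_within(samples, mean, sd, k):
    """Return the fraction of samples lying within k standard deviations of the mean."""
    lower, upper = mean - k * sd, mean + k * sd
    inside = sum(1 for x in samples if lower <= x <= upper)
    return inside / len(samples)


random.seed(0)  # fixed seed so repeated runs draw the same sample
mean, sd = 0.0, 1.0  # standard normal distribution

# Draw a large sample from a normal distribution with the given mean and SD
samples = [random.gauss(mean, sd) for _ in range(100_000)]

# Observed fractions within 1, 2 and 3 standard deviations of the mean
fractions = {k: sample_fraction_within(samples, mean, sd, k) for k in (1, 2, 3)}
for k, frac in fractions.items():
    print(f"Fraction of samples within {k} SD = {frac:.4f}")
```

Because the sample is random, the printed fractions only approximate 0.68, 0.95 and 0.997; a larger sample generally brings them closer.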
Below is a graph of the standard normal distribution (mean = 0, standard deviation = 1), illustrating the Empirical Rule.
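We can verify this using functions provided by Python’s SciPy library.
The cdf() method of the scipy.stats.norm object returns the cumulative probability, that is, the area under the distribution curve to the left of a given value. The fraction of values within k standard deviations of the mean is therefore the cumulative probability at (mean + k × SD) minus the cumulative probability at (mean − k × SD).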
Syntax : norm.cdf(x, loc=0, scale=1)
- x : value up to which cumulative probability is to be calculated
- loc : mean of the distribution (default 0)
- scale : standard deviation (SD) of the distribution (default 1)
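A short usage sketch of the call described above, assuming SciPy is installed. In SciPy's signature the mean and standard deviation are named `loc` and `scale` and can be passed positionally or by keyword; the mean of 50 and standard deviation of 10 are made-up example values.

```python
from scipy.stats import norm

# With the defaults (mean 0, standard deviation 1), half of the
# area under the curve lies to the left of the mean.
p_standard = norm.cdf(0)

# The same holds for any normal distribution evaluated at its own mean.
# Here the mean and SD are passed positionally.
p_shifted = norm.cdf(50, 50, 10)

# Area to the left of one standard deviation above the mean (roughly 0.84),
# this time using the keyword names.
p_one_sd_above = norm.cdf(60, loc=50, scale=10)

print(p_standard, p_shifted, p_one_sd_above)
```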
Below is the implementation :
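A minimal sketch of such a check, assuming SciPy is installed. The helper `fraction_within` and the variable names are illustrative; the check itself only relies on subtracting two `norm.cdf` values.

```python
from scipy.stats import norm

# Parameters of the standard normal distribution
mean = 0
sd = 1


def fraction_within(k, mean, sd):
    """Area under the normal curve between mean - k*sd and mean + k*sd."""
    upper = norm.cdf(mean + k * sd, mean, sd)  # area to the left of the upper bound
    lower = norm.cdf(mean - k * sd, mean, sd)  # area to the left of the lower bound
    return upper - lower


one_sd = fraction_within(1, mean, sd)
two_sd = fraction_within(2, mean, sd)
three_sd = fraction_within(3, mean, sd)

print("Fraction of values within one SD =", one_sd)
print("Fraction of values within two SD =", two_sd)
print("Fraction of values within three SD =", three_sd)
```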
Fraction of values within one SD = 0.6826894921370859
Fraction of values within two SD = 0.9544997361036416
Fraction of values within three SD = 0.9973002039367398
Hence, we see that the fractions of values are approximately 0.68, 0.95 and 0.997. Thus, the Empirical Rule is verified.