Python – 68-95-99.7 rule in Statistics
The Empirical Rule (also called the 68-95-99.7 Rule or the Three Sigma Rule) states that for any normal distribution, the following approximations hold :
- About 68% of the observed values lie within 1 standard deviation of the mean (between mean − SD and mean + SD)
- About 95% of the observed values lie within 2 standard deviations of the mean (between mean − 2·SD and mean + 2·SD)
- About 99.7% of the observed values lie within 3 standard deviations of the mean (between mean − 3·SD and mean + 3·SD)
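In symbols, the three statements can be written as below. The notation is introduced here for illustration and is not used elsewhere in the article: X is a normally distributed variable, μ its mean, σ its standard deviation, and Φ the cumulative distribution function of the standard normal distribution.

```latex
\begin{align*}
P(\mu - \sigma \le X \le \mu + \sigma)   &= \Phi(1) - \Phi(-1) \approx 0.6827 \\
P(\mu - 2\sigma \le X \le \mu + 2\sigma) &= \Phi(2) - \Phi(-2) \approx 0.9545 \\
P(\mu - 3\sigma \le X \le \mu + 3\sigma) &= \Phi(3) - \Phi(-3) \approx 0.9973
\end{align*}
```

Because the right-hand sides depend only on the number of standard deviations, the same percentages apply whatever the mean and standard deviation are.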
Below is a graph of the standard normal distribution (mean = 0 and standard deviation = 1), illustrating the Empirical Rule.
We can verify this using functions provided by Python’s SciPy module.
We can use the cdf() method of the scipy.stats.norm object to calculate the cumulative probability (the area under the distribution curve to the left of a given value).
Syntax : norm.cdf(x, loc=mean, scale=SD)
- x : value up to which cumulative probability is to be calculated
- loc : mean of the distribution (defaults to 0)
- scale : standard deviation (SD) of the distribution (defaults to 1)
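As a quick illustration of these parameters, the sketch below evaluates cdf() for a made-up distribution with mean 100 and SD 15 (these numbers are assumptions chosen only for the example). It also shows that shifting and scaling x by hand gives the same result as passing loc and scale.

```python
from scipy.stats import norm

# Hypothetical example distribution: mean 100, standard deviation 15.
mean, sd = 100.0, 15.0

# Cumulative probability up to the mean itself: half the area lies to its left.
p_at_mean = norm.cdf(mean, loc=mean, scale=sd)

# Cumulative probability up to one SD above the mean.
p_one_sd_above = norm.cdf(mean + sd, loc=mean, scale=sd)

# The same probability, obtained by standardizing x and using the
# default loc=0, scale=1 (the standard normal distribution).
z = ((mean + sd) - mean) / sd
p_standardized = norm.cdf(z)

print("P(X <= mean)      =", p_at_mean)
print("P(X <= mean + SD) =", p_one_sd_above)
print("Standardized      =", p_standardized)
```

Since cdf() only gives the area to the left of x, the area between two points is obtained by subtracting one cdf() value from another.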
Below is the implementation :
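A minimal sketch of the check, assuming SciPy is installed: the fraction of values within k standard deviations is the difference cdf(mean + k·SD) − cdf(mean − k·SD). The helper name fraction_within is illustrative and not part of SciPy.

```python
from scipy.stats import norm

# Parameters of the standard normal distribution.
mean = 0.0
sd = 1.0


def fraction_within(k, mean, sd):
    """Area under the normal curve between mean - k*sd and mean + k*sd."""
    upper = norm.cdf(mean + k * sd, loc=mean, scale=sd)
    lower = norm.cdf(mean - k * sd, loc=mean, scale=sd)
    return upper - lower


one_sd = fraction_within(1, mean, sd)
two_sd = fraction_within(2, mean, sd)
three_sd = fraction_within(3, mean, sd)

print("Fraction of values within one SD =", one_sd)
print("Fraction of values within two SD =", two_sd)
print("Fraction of values within three SD =", three_sd)
```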
Output :
Fraction of values within one SD = 0.6826894921370859
Fraction of values within two SD = 0.9544997361036416
Fraction of values within three SD = 0.9973002039367398
Hence, we see that the fractions of values are approximately equal to 0.68, 0.95 and 0.997. Thus, the Empirical Rule is verified.
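The same rule can also be checked against observed values rather than the exact curve. The sketch below is an addition that assumes NumPy is available: it draws random samples from a standard normal distribution and counts how many fall within 1, 2 and 3 sample standard deviations of the sample mean. The sample size, seed and helper name empirical_fraction are arbitrary choices for illustration.

```python
import numpy as np

# Draw samples from a standard normal distribution (seed fixed for repeatability).
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=200_000)


def empirical_fraction(data, k):
    """Fraction of data points within k sample SDs of the sample mean."""
    mu = data.mean()
    sigma = data.std()
    return float(np.mean(np.abs(data - mu) <= k * sigma))


fractions = {k: empirical_fraction(samples, k) for k in (1, 2, 3)}

for k, frac in fractions.items():
    print(f"Observed fraction within {k} SD = {frac:.4f}")
```

The observed fractions vary slightly from run to run when the seed changes, but with a large sample they should stay close to 0.68, 0.95 and 0.997.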