# Python – Normal Distribution in Statistics

A probability distribution describes how likely each possible outcome of a random variable is. A distribution is either continuous or discrete, depending on the values the random variable can take. Common probability distributions include the normal, uniform, and exponential distributions. This article covers the normal distribution and shows how to compute and plot it with Python.

## What is Normal Distribution

The normal distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetric about its mean and has a bell-shaped curve. It is one of the most widely used probability distributions. Two parameters characterize it:

• Mean (μ) – It represents the center of the distribution
• Standard deviation (σ) – It represents the spread of the curve

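The formula for the probability density function of the normal distribution is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$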

### Properties Of Normal Distribution

• Symmetric distribution – The normal distribution is symmetric about its mean. The distribution is balanced around the mean, with half of the data on either side.
• Bell-shaped curve – The graph of a normal distribution is a bell-shaped curve, with most of the points concentrated around the mean. The position and width of the curve are determined by the mean and standard deviation of the distribution.
• Empirical Rule – The normal distribution follows the empirical rule: about 68% of the data lies within 1 standard deviation of the mean, about 95% lies within 2 standard deviations, and about 99.7% lies within 3 standard deviations.

Empirical rule in Normal distribution

• Additive Rule – The sum of two or more independent normally distributed random variables is itself normally distributed.
• Central Limit Theorem – It states that if we take the mean of a large number of data points drawn from independent and identically distributed random variables with finite variance, these means approximately follow a normal distribution regardless of the original distribution.
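The empirical rule and the central limit theorem can both be checked numerically. The following is a minimal sketch, assuming NumPy and SciPy are installed. The function names `coverage_within` and `sample_means` are hypothetical helpers introduced only for illustration, and the uniform source distribution, sample sizes, and seed are arbitrary choices.

```python
import numpy as np
from scipy.stats import norm


def coverage_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return norm.cdf(k) - norm.cdf(-k)


def sample_means(n_samples, sample_size, seed=0):
    """Means of n_samples independent samples drawn from a uniform(0, 1) distribution."""
    rng = np.random.default_rng(seed)
    draws = rng.uniform(0.0, 1.0, size=(n_samples, sample_size))
    return draws.mean(axis=1)


# Empirical rule: coverage for 1, 2 and 3 standard deviations
for k in (1, 2, 3):
    print(f"within {k} sd: {coverage_within(k) * 100:.2f}%")

# Central limit theorem: means of uniform samples cluster around 0.5,
# with a spread close to sqrt(1/12) / sqrt(sample_size)
means = sample_means(n_samples=10_000, sample_size=50)
print("mean of sample means:", round(means.mean(), 3))
print("std of sample means: ", round(means.std(), 3))
print("theoretical std:     ", round(np.sqrt(1 / 12) / np.sqrt(50), 3))
```

Plotting a histogram of `means` should show the familiar bell shape even though the underlying data is uniform.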

## Normal Distribution Using Python

Python has several libraries that can be used to plot the normal distribution and to evaluate its probability density function at given data points.

### Modules Needed For Plotting and Applying Normal Distribution

• NumPy – A Python library for numerical computation on multidimensional arrays (ndarray), with a large collection of mathematical functions that operate on these arrays.
• Pandas – A Python library built on top of NumPy for working with tabular data in DataFrames. It is used for data cleaning, merging, reshaping, and aggregation.
• Matplotlib – A plotting library for 2D and 3D visualizations that can export figures to a variety of output formats.
• SciPy – A Python library for scientific computing. Its scipy.stats module is widely used for statistics, and other modules cover tasks such as integration and optimization.

We can use these modules to compute and plot the normal distribution of data points.

Calculating the probability density at a single data point using Python

```python
import numpy as np


def normal_dist(x, mean, sd):
    # Normal density: 1 / (sd * sqrt(2 * pi)) * exp(-0.5 * ((x - mean) / sd) ** 2)
    prob_density = (1 / (sd * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mean) / sd) ** 2)
    return prob_density


mean = 0
sd = 1
x = 1
result = normal_dist(x, mean, sd)
print(f"{result:.6f}")
```

Output:

`0.241971`
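As a sanity check, a hand-written density can be compared with `scipy.stats.norm.pdf`, which evaluates the same function. This sketch assumes SciPy is installed; `normal_pdf` is a hypothetical name used here for a standalone implementation of the density formula.

```python
import numpy as np
from scipy.stats import norm


def normal_pdf(x, mean, sd):
    """Normal probability density evaluated at x."""
    coefficient = 1.0 / (sd * np.sqrt(2.0 * np.pi))
    exponent = -0.5 * ((x - mean) / sd) ** 2
    return coefficient * np.exp(exponent)


# Evaluate both versions on a grid of points around the mean
xs = np.linspace(-4.0, 4.0, 9)
manual = normal_pdf(xs, mean=0.0, sd=1.0)
library = norm.pdf(xs, loc=0.0, scale=1.0)

# True when the two agree to within floating-point tolerance
print(np.allclose(manual, library))
```

A mismatch here usually points to a slip in the normalizing constant, since the exponential term alone does not integrate to 1.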

Plotting a histogram of normally distributed random data using Python

```python
import numpy as np
import matplotlib.pyplot as plt

# Mean of the distribution
mean = 100

# Standard deviation of the distribution
standard_deviation = 5

# Number of samples
size = 100000

# Creating normally distributed data
values = np.random.normal(mean, standard_deviation, size)

# Plotting the histogram with 100 bins
plt.hist(values, 100)
# Plotting the mean line
plt.axvline(values.mean(), color='k', linestyle='dashed', linewidth=2)
plt.show()
```

Output:

Normal Distribution graph
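The histogram can also be compared with the theoretical curve numerically. The sketch below, which assumes NumPy and SciPy are available, normalizes the histogram to a density with `np.histogram(..., density=True)` and measures how far it is from `norm.pdf` at the bin centers. The seed and variable names are illustrative choices.

```python
import numpy as np
from scipy.stats import norm

mean = 100
standard_deviation = 5
size = 100_000

# Seeded generator so the run is reproducible
rng = np.random.default_rng(42)
values = rng.normal(mean, standard_deviation, size)

# density=True rescales bin counts so the histogram integrates to 1
heights, edges = np.histogram(values, bins=100, density=True)
centers = (edges[:-1] + edges[1:]) / 2

# Theoretical density at each bin center
expected = norm.pdf(centers, loc=mean, scale=standard_deviation)

max_gap = np.max(np.abs(heights - expected))
print("largest gap between histogram and pdf:", max_gap)
```

To see the same comparison visually, one option is to pass `density=True` to `plt.hist` and draw `expected` against `centers` with `plt.plot`.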

### Normal Distribution Example with Python

Suppose there are 100 students in a class, and in one mathematics test the average mark is 78 with a standard deviation of 25. The marks follow a normal distribution. We can use this information to answer some questions about the students' marks.

#### Python Code for Percentage of Students who got less than 60 marks

Here we use norm from the scipy.stats module to model marks with a population mean of 78 and a standard deviation of 25. The code converts the score to a z-score and passes it to norm.cdf(), which returns the probability of falling below that value.

scipy.stats.norm is a normal continuous random variable. As an instance of the rv_continuous class, it inherits a collection of generic methods and completes them with details specific to this particular distribution.

q : lower or upper tail probability
x : quantiles
loc : [optional] mean of the distribution. Default = 0
scale : [optional] standard deviation of the distribution. Default = 1
size : [tuple of ints, optional] shape of random variates

Results : normal continuous random variable

```python
# import required libraries
from scipy.stats import norm

# Given information
mean = 78
std_dev = 25
total_students = 100
score = 60

# Calculate z-score for 60
z_score = (score - mean) / std_dev

# Calculate the probability of getting a score less than 60
prob = norm.cdf(z_score)

# Calculate the percentage of students who got less than 60 marks
percent = prob * 100

# Print the result
print("Percentage of students who got less than 60 marks:", round(percent, 2), "%")
```

Output:

`Percentage of students who got less than 60 marks: 23.58 %`

This means that approximately 23.58% of the students scored less than 60 marks in mathematics.
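The `loc` and `scale` parameters let `norm` represent a distribution with any mean and standard deviation directly, as an alternative to standardizing values to z-scores by hand. The sketch below assumes SciPy is installed and uses the mean of 78 and standard deviation of 25 from the example; `marks` is a hypothetical variable name for the frozen distribution.

```python
from scipy.stats import norm

# Freeze a normal distribution with mean 78 and standard deviation 25
marks = norm(loc=78, scale=25)

# cdf(x): probability of a value below x
below_60 = marks.cdf(60)

# Same probability via a z-score and the standard normal distribution
z_score = (60 - 78) / 25
below_60_standard = norm.cdf(z_score)

# ppf(q): the quantile x at which the lower tail probability equals q
recovered_score = marks.ppf(below_60)

# rvs(size): random variates with the requested shape
samples = marks.rvs(size=(2, 3), random_state=0)

print(round(below_60 * 100, 2), round(below_60_standard * 100, 2))
print(round(recovered_score, 2))
print(samples.shape)
```

Freezing the distribution once keeps the mean and standard deviation in a single place when several questions are asked about the same data.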

#### Python Code for Percentage of Students who have scored More than 70

To get the percentage of students who scored more than 70, we first find the probability of scoring less than 70 and then subtract it from 1 to get the probability of scoring more than 70.

```python
# import required libraries
from scipy.stats import norm

# Given information
mean = 78
std_dev = 25
total_students = 100
score = 70

# Calculate z-score for 70
z_score = (score - mean) / std_dev

# Calculate the probability of getting a score less than 70
prob = norm.cdf(z_score)

# Calculate the percentage of students who got more than 70 marks
percent = (1 - prob) * 100

# Print the result
print("Percentage of students who got more than 70 marks:", round(percent, 2), "%")
```

Output:

`Percentage of students who got more than 70 marks: 62.55 %`
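SciPy also exposes the survival function `norm.sf`, defined as 1 - cdf, which gives the upper tail probability in one call. A minimal sketch, assuming SciPy is installed and reusing the example's mean and standard deviation:

```python
from scipy.stats import norm

mean = 78
std_dev = 25
score = 70

# Upper tail probability computed as 1 - cdf
prob_complement = 1 - norm.cdf(score, loc=mean, scale=std_dev)

# Same probability from the survival function
prob_sf = norm.sf(score, loc=mean, scale=std_dev)

print(round(prob_complement * 100, 2), round(prob_sf * 100, 2))
```

For scores far above the mean, `sf` can be more accurate than subtracting from 1, because the cdf there is very close to 1.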

#### Python Code for Percentage of Students who have scored between 75 and 85

To get the percentage of students who scored between 75 and 85, we subtract the probability of scoring less than 75 from the probability of scoring less than 85.

```python
# import required libraries
from scipy.stats import norm

# Given information
mean = 78
std_dev = 25
total_students = 100
min_score = 75
max_score = 85

# Calculate z-score for 75
z_min_score = (min_score - mean) / std_dev
# Calculate z-score for 85
z_max_score = (max_score - mean) / std_dev

# Calculate the probability of getting less than 75
min_prob = norm.cdf(z_min_score)

# Calculate the probability of getting less than 85
max_prob = norm.cdf(z_max_score)

# Probability of a score between 75 and 85, as a percentage
percent = (max_prob - min_prob) * 100

# Print the result
print("Percentage of students who got marks between 75 and 85 is", round(percent, 2), "%")
```

Output:

`Percentage of students who got marks between 75 and 85 is 15.8 %`
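The three calculations above follow one pattern: the probability of a range is the difference of two cdf values. The sketch below wraps this in a hypothetical helper, `percent_between`, and also scales each result by the class size of 100 to estimate a head count. It assumes SciPy is installed.

```python
import math
from scipy.stats import norm


def percent_between(low, high, mean, std_dev):
    """Percentage of a normal population with values between low and high."""
    upper = norm.cdf(high, loc=mean, scale=std_dev)
    lower = norm.cdf(low, loc=mean, scale=std_dev)
    return (upper - lower) * 100


mean, std_dev, total_students = 78, 25, 100

# Infinite bounds turn the range into a one-sided tail
below_60 = percent_between(-math.inf, 60, mean, std_dev)
above_70 = percent_between(70, math.inf, mean, std_dev)
between_75_85 = percent_between(75, 85, mean, std_dev)

for label, pct in [("< 60", below_60), ("> 70", above_70), ("75 to 85", between_75_85)]:
    # Expected head count is the percentage applied to the class size
    expected_students = pct / 100 * total_students
    print(f"{label}: {pct:.2f}% (about {expected_students:.0f} students)")
```

The head counts are expectations under the normal model, so the actual number of students in each range can differ.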
