Python's statistics module provides functions for common statistical calculations. variance() is one of them: it calculates the variance of a sample of data (a sample is a subset of a population).
variance() should be used only when the data is a sample. To calculate the variance of an entire population, use pvariance() instead.
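As a quick illustration of the difference, the sketch below runs both functions on the same small list. The values are made up for this example. variance() divides the sum of squared deviations by n − 1, while pvariance() divides it by n, so the sample variance comes out slightly larger.

```python
from statistics import pvariance, variance

# Illustrative data, chosen only for this example
data = [2, 4, 4, 4, 5, 5, 7, 9]

sample_var = variance(data)        # sum of squared deviations / (n - 1)
population_var = pvariance(data)   # sum of squared deviations / n

print("Sample variance     :", sample_var)
print("Population variance :", population_var)
```

For this list the sum of squared deviations is 32, so the two results are 32/7 and 32/8 respectively.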
In statistics, variance is the expected squared deviation of a variable from its mean. It measures how far the values in a data set are spread out from their mean. A low variance indicates that the values are clustered close to the mean, whereas a high variance indicates that they are spread widely around it.
Variance is an important tool in the sciences, where statistical analysis of data is common. It is the square of the standard deviation of the data set and is also known as the second central moment of a distribution. It is usually represented by σ², s², or Var(X) in statistics.
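The relationship with the standard deviation can be checked directly with stdev() from the same module. This is a minimal sketch with illustrative values; math.isclose() is used because the square root and the squaring can introduce tiny floating-point differences.

```python
import math
from statistics import stdev, variance

# Illustrative values, chosen only for this example
data = [1.5, 2.5, 2.5, 2.75, 3.25, 4.75]

var = variance(data)
sd = stdev(data)

# Squaring the sample standard deviation recovers the sample variance
print("variance :", var)
print("stdev**2 :", sd ** 2)
print("equal (within rounding):", math.isclose(var, sd ** 2))
```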
Variance is calculated by the following formula:
$$\operatorname{Var}(X) = \operatorname{E}\left[(X - \mu)^{2}\right] = \operatorname{E}\left[X^{2}\right] - \mu^{2}$$
That is, the variance is the mean of the squared deviations from the mean μ, which is equal to the mean of the squares minus the square of the mean.
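For a sample, the expectation is replaced by a sum over the observations, and variance() divides that sum by n − 1 rather than n. The sketch below computes this by hand and compares it with the library result. The helper name manual_sample_variance is made up for this example and is not part of the statistics module.

```python
import math
from statistics import mean, variance

def manual_sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1 (hypothetical helper)."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    m = mean(data)
    return sum((x - m) ** 2 for x in data) / (n - 1)

sample = [2.74, 1.23, 2.63, 2.22, 3, 1.98]

by_hand = manual_sample_variance(sample)
by_library = variance(sample)

print("By hand    :", by_hand)
print("By library :", by_library)
print("Match (within rounding):", math.isclose(by_hand, by_library))
```

The two values may differ in the last few digits because the library uses exact arithmetic internally, while the hand-written version accumulates ordinary floating-point rounding.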
Syntax : variance(data, xbar=None)
Parameters :
data : An iterable of at least two real-valued numbers.
xbar (Optional) : The mean of data. If it is omitted or None, the mean is calculated automatically.
Return type : Returns the sample variance of the values passed in data.
Exceptions :
StatisticsError is raised if data contains fewer than two values.
No exception is raised when the value passed as xbar does not match the actual mean of the data; the result returned may simply be invalid.
Code #1 :
Python3
import statistics

# A small sample of real-valued data
sample = [2.74, 1.23, 2.63, 2.22, 3, 1.98]

# Print the variance of the sample
print("Variance of sample set is %s" % statistics.variance(sample))
Output :
Variance of sample set is 0.40924
Code #2 : Demonstrates variance() on a range of data-types
Python3
from statistics import variance
from fractions import Fraction as fr

# Samples of positive integers, negative integers, and mixed integers
sample1 = (1, 2, 5, 4, 8, 9, 12)
sample2 = (-2, -4, -3, -1, -5, -6)
sample3 = (-9, -1, 0, 2, 1, 3, 4, 19)

# Sample of fractions
sample4 = (fr(1, 2), fr(2, 3), fr(3, 4),
           fr(5, 6), fr(7, 8))

# Sample of floating-point values
sample5 = (1.23, 1.45, 2.1, 2.2, 1.9)

# Print the variance of each sample
print("Variance of Sample 1 is %s" % variance(sample1))
print("Variance of Sample 2 is %s" % variance(sample2))
print("Variance of Sample 3 is %s" % variance(sample3))
print("Variance of Sample 4 is %s" % variance(sample4))
print("Variance of Sample 5 is %s" % variance(sample5))
Output :
Variance of Sample 1 is 15.80952380952381
Variance of Sample 2 is 3.5
Variance of Sample 3 is 61.125
Variance of Sample 4 is 1/45
Variance of Sample 5 is 0.17613000000000006
Code #3 : Demonstrates the use of xbar parameter
Python3
import statistics

sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)

# Compute the mean once and pass it in as xbar,
# so that variance() does not have to recalculate it
m = statistics.mean(sample)

print("Variance of Sample set is %s"
      % statistics.variance(sample, xbar=m))
Output :
Variance of Sample set is 0.3656666666666667
Code #4 : Demonstrates the result when the value of xbar is not the mean of the data
Python3
import statistics

sample = (1, 1.3, 1.2, 1.9, 2.5, 2.2)

# -100 is deliberately not the mean of the sample
print(statistics.variance(sample, xbar=-100))
Output :
0.3656666666663053
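Note : No exception is raised, because variance() does not check xbar. The value returned is not reliable: here it agrees with the correct result from Code #3 only to about 12 decimal places. The size of the error depends on the Python version (the traceback in the StatisticsError example shows that these outputs were produced on Python 3.5), and newer versions may return a value that is far from the true variance.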
Code #5 : Demonstrates StatisticsError
Python3
import statistics

# An empty sample: fewer than two data points
sample = []

# Raises StatisticsError
print(statistics.variance(sample))
Output :
Traceback (most recent call last):
File "/home/64bf6d80f158b65d2b75c894d03a7779.py", line 10, in <module>
print(statistics.variance(sample))
File "/usr/lib/python3.5/statistics.py", line 555, in variance
raise StatisticsError('variance requires at least two data points')
statistics.StatisticsError: variance requires at least two data points
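In real code the exception is usually caught rather than left to end the program. The following is a minimal sketch of one way to do that. The wrapper name safe_variance and the choice to return None are assumptions made for this example, not something the statistics module provides.

```python
from statistics import StatisticsError, variance

def safe_variance(data):
    """Return the sample variance, or None if there are fewer than two values.

    Hypothetical helper written for this example.
    """
    try:
        return variance(data)
    except StatisticsError:
        return None

print(safe_variance([]))       # None: no data points
print(safe_variance([4.2]))    # None: only one data point
print(safe_variance([1, 3]))   # two data points are enough
```

Returning None keeps the example short; depending on the application, re-raising with a clearer message or skipping the data set may be a better choice.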
Applications :
Variance is widely used in statistics and in the analysis of large amounts of data. In practice, the true (population) mean is usually unknown, so it is replaced by the sample mean, and the sample variance, which divides by n − 1, is used as an unbiased estimator of the population variance. Real-world observations, such as the rises and falls of a company's share price over a single day, can never cover every possible observation. The variance is therefore calculated from a finite sample. It will generally not match the variance of the whole population, but it gives an estimate that is good enough for further calculations.
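The idea of estimating a population variance from samples can be illustrated on a very small scale. The sketch below uses a made-up population of five values, lists every ordered sample of size 2 drawn with replacement, and averages the sample variances. Under those assumptions the average equals the population variance exactly, although any single sample can be well off. Fractions are used so that the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, pvariance, variance

# A tiny, made-up "population"; Fractions keep the arithmetic exact
population = [Fraction(x) for x in (1, 2, 3, 4, 5)]

# Every ordered sample of size 2, drawn with replacement (25 samples)
samples = list(product(population, repeat=2))

# Average of the sample variances over all possible samples
average_sample_variance = mean(variance(s) for s in samples)

print("Population variance        :", pvariance(population))
print("Average of sample variances:", average_sample_variance)
```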
Last Updated :
11 Nov, 2022