Python statistics | pvariance()

Prerequisite : Python statistics | variance()

The pvariance() function calculates the variance of an entire population, rather than that of a sample. The difference between variance() and pvariance() lies in the divisor: variance() treats the data as a sample and divides the sum of squared deviations from the mean by n - 1, while pvariance() treats the data as the entire population and divides by n.

Population variance, like sample variance, tells how spread out the data points in a specific population are. It is the average of the squared distances from the data points to the mean of the data set. The population variance is a parameter of the population and does not depend on research methods or sampling practices.
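The definition above can be sketched directly in Python and compared with the library function. This is only an illustration: the helper name population_variance and the data values are made up for this example and are not part of the statistics module.

```python
from statistics import mean, pvariance


def population_variance(values):
    """Hypothetical helper: average of the squared distances to the mean."""
    values = list(values)
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


# Illustrative data (assumed for this sketch); its mean is 5
data = [2, 4, 4, 4, 5, 5, 7, 9]

manual = population_variance(data)
library = pvariance(data)

# Both values equal 4 for this data
print("Manual computation :", manual)
print("statistics.pvariance:", library)
```

For everyday use pvariance() is still the better choice, since the hand-written version does nothing to limit floating-point rounding error.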



Syntax : pvariance(data, mu=None)

Parameters :
data : An iterable of real-valued numbers.
mu (optional) : The actual mean of the data set / population. If omitted, the mean is computed from data.

Return type : Returns the population variance of the values passed as parameter.

Exceptions :
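StatisticsError is raised when data is empty; pvariance() needs at least one value (it is variance() that needs at least two).
Misleading results may be returned when the value provided as mu doesn’t match the actual mean of the data set, because the function does not verify it.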

Code #1 :

# Python code to demonstrate the
# use of pvariance()
  
# importing statistics module
import statistics
  
# creating a random population list
population = (1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.9, 2.2,
              2.3, 2.4, 2.6, 2.9, 3.0, 3.4, 3.3, 3.2)
  
  
# Prints the population variance
print("Population variance is %s" 
      %(statistics.pvariance(population)))

Output :

Population variance is 0.6658984375

 
Code #2 : Demonstrates pvariance() on different kinds of population sets.

# Python code to demonstrate pvariance()
# on various range of population sets
  
# importing statistics module
from statistics import pvariance
  
# importing fractions module as F
from fractions import Fraction as F
  
  
# Population set of positive integers
pop1 = (1, 2, 3, 5, 4, 6, 1, 2, 2, 3, 1, 3,
        7, 8, 9, 1, 1, 1, 2, 6, 7, 8, 9)

# Creating a population set of
# negative integers
pop2 = (-36, -35, -34, -32, -30, -31, -33, -33, -33,
        -38, -36, -35, -34, -38, -40, -31, -32)

# Creating a population set of
# fractional numbers
pop3 = (F(1, 3), F(2, 4), F(2, 3),
        F(3, 2), F(2, 5), F(2, 2),
        F(1, 1), F(1, 4), F(1, 2), F(2, 1))

# Creating a population set of
# decimal values
pop4 = (3.45, 3.2, 2.5, 4.6, 5.66, 6.43,
        4.32, 4.23, 6.65, 7.87, 9.87, 1.23,
        1.00, 1.45, 10.12, 12.22, 19.88)

# Print the population variance for
# the created population sets
print("Population variance of set 1 is %s"
      % (pvariance(pop1)))

print("Population variance of set 2 is %s"
      % (pvariance(pop2)))

print("Population variance of set 3 is %s"
      % (pvariance(pop3)))

print("Population variance of set 4 is %s"
      % (pvariance(pop4)))

Output :

Population variance of set 1 is 7.913043478260869
Population variance of set 2 is 7.204152249134948
Population variance of set 3 is 103889/360000
Population variance of set 4 is 21.767923875432526

 
Code #3 : Demonstrates the use of mu parameter.

# Python code to demonstrate the use
#  of 'mu' parameter on pvariance()
  
# importing statistics module
import statistics
  
# pvariance() does not check whether the
# value passed as mu is the actual mean of
# the data set. Providing an incorrect
# value can therefore lead to a result
# that is not the population variance.
  
# Creating a population tree of the
# age of kids in a locality
tree = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
        12, 12, 12, 12, 13, 1, 2, 12, 2, 2,
              2, 3, 4, 5, 5, 5, 5, 6, 6, 6)
  
# Finding the mean of population tree
m = statistics.mean(tree)
  
# Using the mu parameter
# while using pvariance()
print("Population Variance is %s"
      % (statistics.pvariance(tree, mu=m)))

Output :


Population Variance is 14.30385015608741
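A small sketch of why mu is worth passing at all: when the mean has already been computed, handing it to pvariance() gives the same result as letting the function compute it. The data values and the deliberately wrong value 0 are assumptions made up for this illustration. What the function returns for a wrong mu is not shown as output here, since it should not be relied on and may differ between Python versions.

```python
from statistics import mean, pvariance

# Illustrative data (assumed for this sketch); its mean is 3
data = [1, 2, 3, 4, 5]

# Compute the mean once and reuse it
m = mean(data)
with_mu = pvariance(data, mu=m)
without_mu = pvariance(data)

# Both calls give the same population variance (2 here)
print("With mu    :", with_mu)
print("Without mu :", without_mu)

# Deliberately wrong mean: no error is raised, but the
# result should not be treated as the population variance
wrong = pvariance(data, mu=0)
print("With wrong mu :", wrong)
```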

 
Code #4 : Demonstrate the difference between pvariance() and variance()

# Python code to demonstrate the
# difference between pvariance()
# and variance()

# importing statistics module
import statistics

# Population tree
tree = (1.1, 1.22, .23, .55, .67, 2.33, 2.81,
        1.54, 1.2, 0.2, 0.1, 1.22, 1.61)

# Sample extracted from the population tree
sample = (1.22, .23, .55, .67, 2.33,
          2.81, 1.54, 1.2, 0.2)

# Print the population variance as
# well as the sample variance
print("Variance of the whole population is %s"
      % (statistics.pvariance(tree)))

print("Variance of the sample from population is %s"
      % (statistics.variance(sample)))

# Print a blank line, then the difference between
# the population variance and the sample variance
print()

print("Difference in Population variance "
      "and Sample variance is %s"
      % (abs(statistics.pvariance(tree)
             - statistics.variance(sample))))

Output :

Variance of the whole population is 0.6127751479289941
Variance of the sample from population is 0.8286277777777779

Difference in Population variance and Sample variance is 0.21585262984878373

Note : We can see from the above example that the population variance and the variance of a sample drawn from it don’t differ by a huge value (about 0.216 here).
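Code #4 compares a population with a smaller sample taken from it. When both functions are applied to one and the same data set, they differ only in the divisor (n for pvariance(), n - 1 for variance()), so one result can be obtained from the other. A minimal sketch, with data values assumed purely for illustration:

```python
from statistics import pvariance, variance

# Illustrative data (assumed for this sketch)
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

p_var = pvariance(data)   # sum of squared deviations / n
s_var = variance(data)    # sum of squared deviations / (n - 1)

print("pvariance():", p_var)
print("variance() :", s_var)

# Rescaling the population variance by n / (n - 1)
# reproduces the sample variance (up to rounding)
print("Rescaled   :", p_var * n / (n - 1))
```

The factor n / (n - 1) approaches 1 as n grows, so the two functions give nearly the same value on large data sets.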
 
Code #5 : Demonstrates StatisticsError

# Python code to demonstrate StatisticsError
  
# importing statistics module
import statistics
  
# creating an empty population set
pop = ()
  
# will raise StatisticsError
print(statistics.pvariance(pop))

Output :

Traceback (most recent call last):
  File "/home/fa112e1405f09970eeddd48214318a3c.py", line 10, in <module>
    print(statistics.pvariance(pop))
  File "/usr/lib/python3.5/statistics.py", line 603, in pvariance
    raise StatisticsError('pvariance requires at least one data point')
statistics.StatisticsError: pvariance requires at least one data point
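If the input may be empty, the error can be caught instead of letting the program stop. The wrapper below is a hypothetical helper written for this illustration (safe_pvariance is not part of the statistics module); returning None for empty input is just one possible design choice.

```python
from statistics import StatisticsError, pvariance


def safe_pvariance(values):
    """Hypothetical wrapper: return None instead of raising on empty data."""
    try:
        return pvariance(values)
    except StatisticsError:
        return None


# Empty population: the StatisticsError is caught
print(safe_pvariance(()))

# A single data point is enough for pvariance();
# its variance is 0
print(safe_pvariance((7,)))
```

Note that a single value is accepted by pvariance(), whereas variance() requires at least two data points.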

 
Applications :
The applications of population variance are much the same as those of sample variance, although a population data set is usually much larger than a sample. pvariance() should be used only when the variance of an entire population is to be calculated; for calculating the variance of a sample, variance() is preferred. Population variance is an important tool in statistics and in handling large amounts of data. When the true population mean is unknown and pvariance() is applied to a sample, so that the sample mean takes its place, the result is a biased estimator of the population variance.






Improved By : nidhi_biet


