Mathematics | Mean, Variance and Standard Deviation

The mean is the average of a given set of data. Consider the following example:

 2,\ 4,\ 4,\ 4,\ 5,\ 5,\ 7,\ 9

These eight data points have a mean (average) of 5:

 \frac{2 + 4 + 4 + 4 + 5 + 5 + 7 + 9}{8} = 5.

 
Formula : \mu=\frac{\sum_{i=1}^{N} x_{i}}{N}



Where μ is the mean, x1, x2, x3, …, xN are the elements, and N is the number of elements. Note that the mean is sometimes also denoted by \bar{x}.
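The formula translates almost directly into code. The following is a minimal Python sketch, not part of the original article; the function name `mean` and the variable `data` are chosen here for illustration only.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count N."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# The eight data points from the example above.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(data))  # 5.0
```

Python's standard library offers `statistics.mean` for the same purpose; the hand-written version is shown only to mirror the formula.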



Variance is the average of the squared differences between each number and the mean.
For the example above, first calculate the deviation of each data point from the mean, and square each result:

 \begin{array}{lll} (2-5)^2 = (-3)^2 = 9 && (5-5)^2 = 0^2 = 0 \\ (4-5)^2 = (-1)^2 = 1 && (5-5)^2 = 0^2 = 0 \\ (4-5)^2 = (-1)^2 = 1 && (7-5)^2 = 2^2 = 4 \\ (4-5)^2 = (-1)^2 = 1 && (9-5)^2 = 4^2 = 16. \\ \end{array}

variance = \frac{9 + 1 + 1 + 1 + 0 + 0 + 4 + 16}{8} = 4.

Formula:  \sigma^{2}= \frac { \sum_{i=1}^{N} (x_{i}-\mu)^{2}}{N}

Where μ is the mean and N is the total number of elements (the total frequency, in the case of a frequency distribution).
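As an illustration, the variance formula can be sketched in Python as follows. The names `population_variance` and `data` are assumptions made for this example. The function divides by N, as the formula above does, so it computes what is usually called the population variance.

```python
def population_variance(values):
    """Average of the squared deviations from the mean (divides by N)."""
    n = len(values)
    if n == 0:
        raise ValueError("variance of an empty data set is undefined")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


# The eight data points from the example above.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(data))  # 4.0
```

The standard library function `statistics.pvariance` computes the same quantity.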

 
Standard deviation is the square root of the variance. It measures the extent to which the data vary from the mean.

Standard Deviation (for above data) = \sqrt{ 4 } = 2
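A short Python sketch of this step, assuming the population form of the variance (division by N) used throughout this article. The function name `population_std` is hypothetical and chosen only for this example.

```python
import math


def population_std(values):
    """Square root of the population variance of the values."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation of an empty data set is undefined")
    mu = sum(values) / n
    variance = sum((x - mu) ** 2 for x in values) / n
    return math.sqrt(variance)


# The eight data points from the example above.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_std(data))  # 2.0
```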

Why did mathematicians choose to square and then take a square root to measure deviation? Why not simply take the differences of the values?
One reason is that the sum of the differences from the mean is always 0, by the definition of the mean. The sum of absolute differences could be an option, but absolute differences make it difficult to prove many nice theorems. [Source: MIT Video Lecture at 1:19]
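The cancellation follows from the definition: the sum of (x_i - μ) equals the sum of the x_i minus Nμ, and the sum of the x_i is itself Nμ. The sketch below checks this numerically for the example data and also computes the average absolute difference for comparison. The variable names are illustrative only; for other data sets, floating-point rounding may leave a tiny nonzero residue in the signed sum.

```python
# The eight data points from the example above.
data = [2, 4, 4, 4, 5, 5, 7, 9]
mu = sum(data) / len(data)

# Signed deviations: positive and negative differences cancel out.
signed_sum = sum(x - mu for x in data)

# Absolute deviations do not cancel; their average is the mean absolute deviation.
mean_abs_dev = sum(abs(x - mu) for x in data) / len(data)

print(signed_sum)    # 0.0
print(mean_abs_dev)  # 1.5
```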

\textup{Coefficient of variation} = \frac{\textup{Standard deviation}}{\textup{Mean}} \times 100
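For the example data, the standard deviation is 2 and the mean is 5, so the coefficient of variation is 2 / 5 × 100 = 40 (percent). A minimal Python sketch, with a hypothetical function name and the population standard deviation assumed:

```python
import math


def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("data set is empty")
    mu = sum(values) / n
    if mu == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    std = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    return 100 * std / mu


# The eight data points from the example above.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(coefficient_of_variation(data))  # 40.0
```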

     
    Some Interesting Facts:

  1. The standard deviation is 0 if all entries in the input are the same.
  2. If we add (or subtract) a number, say 7, to (or from) all values in the input set, the mean increases (or decreases) by 7, but the standard deviation does not change.
  3. If we multiply all values in the input set by a number, say 7, both the mean and the standard deviation are multiplied by 7. But if we multiply all input values by a negative number, say -7, the mean is multiplied by -7, while the standard deviation is multiplied by 7.
  4. Standard deviation and variance are both measures of how spread out the numbers are. Variance is expressed in squared units of the data, whereas the standard deviation is in the same units as the data, which makes it easier to interpret as a typical distance from the mean.
  5. Mean, median and mode are measures of the central tendency of data (either grouped or ungrouped).
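Facts 1 to 3 can be checked numerically. The sketch below uses `statistics.mean` and `statistics.pstdev` from Python's standard library on the example data; the helper `summary` and the constant list `[3, 3, 3, 3]` are made up for this illustration.

```python
import math
import statistics

# The eight data points from the example above.
data = [2, 4, 4, 4, 5, 5, 7, 9]


def summary(values):
    """Return (mean, population standard deviation) of the values."""
    return statistics.mean(values), statistics.pstdev(values)


base_mean, base_std = summary(data)
const_mean, const_std = summary([3, 3, 3, 3])
shift_mean, shift_std = summary([x + 7 for x in data])
scale_mean, scale_std = summary([7 * x for x in data])
neg_mean, neg_std = summary([-7 * x for x in data])

# Fact 1: identical entries give a standard deviation of 0.
assert math.isclose(const_std, 0.0, abs_tol=1e-12)

# Fact 2: adding 7 shifts the mean by 7 and leaves the spread unchanged.
assert math.isclose(shift_mean, base_mean + 7)
assert math.isclose(shift_std, base_std)

# Fact 3: multiplying by 7 scales both; multiplying by -7 flips only the mean's sign.
assert math.isclose(scale_mean, 7 * base_mean)
assert math.isclose(scale_std, 7 * base_std)
assert math.isclose(neg_mean, -7 * base_mean)
assert math.isclose(neg_std, 7 * base_std)

print("all checks passed")
```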

 
The question below was asked in a previous year's GATE exam:
http://quiz.geeksforgeeks.org/gate-gate-cs-2012-question-64/

 


Improved By : VaibhavRai3