Measure of Dispersion

  • Last Updated : 23 May, 2022

Data is now generated almost everywhere, and most systems are flooded with it. Many techniques exist to summarize and analyze data. The mean is one of the most important statistics for summarizing the center of a dataset, but it does not describe the data fully on its own: two datasets can share the same mean while one is tightly clustered and the other is widely scattered. Measures of dispersion fill this gap by quantifying the scatter in the data. Let's look at these measures in detail.

Measures of Dispersion

Measures of dispersion quantify the scatter of the data, that is, how far apart the values in the distribution lie. They capture the variation between the different values of the data. Intuitively, dispersion is the extent to which the points of a distribution differ from its average. Measures of dispersion can be classified into the two categories shown below:

  1. Absolute Measures of Dispersion
  2. Relative Measures of Dispersion

Absolute Measures of Dispersion  

These measures of dispersion are expressed in the units of the data themselves (variance, being a squared quantity, is in squared units). For example – Meters, Dollars, Kg, etc. Some absolute measures of dispersion are:

  1. Range: The difference between the largest and the smallest value in the distribution.
  2. Quartile Deviation: Half of the distance between the first quartile (Q1) and the third quartile (Q3). It is also known as the Semi-Interquartile Range.
  3. Mean Deviation: The arithmetic mean of the absolute differences between the values and their mean (or median).
  4. Standard Deviation: The square root of the arithmetic mean of the squared deviations measured from the mean.
  5. Variance: The square of the standard deviation.

Range

The range is the difference between the largest and the smallest values in the distribution. Thus, it can be written as R = L – S, where L stands for the largest value in the distribution and S stands for the smallest value. A higher range implies higher variation. One drawback of this measure is that it takes into account only the maximum and the minimum value, so a single extreme value can make it a poor indicator of how the rest of the values are scattered.

For example, 

10, 20, 15, 0, 100 

The smallest value S in the data = 0, the largest value L in the data = 100 

R = 100 – 0 = 100

Note: Range cannot be calculated for the open-ended frequency distributions. Open-ended frequency distributions are those distributions in which either the lower limit of the lowest class or the higher limit of the highest class is not defined. 

Range for ungrouped data:

Question 1: Find out the range for the following observations. 

20, 24, 31, 17, 45, 39, 51, 61

Solution:

The largest value in the given observations is 61 and the smallest value is 17. The Range is 61 – 17 = 44
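As a minimal sketch, the range of ungrouped data can be computed in Python as shown below. The function name `data_range` is a hypothetical helper chosen for illustration.

```python
def data_range(values):
    """Return R = L - S, the largest value minus the smallest value."""
    if not values:
        raise ValueError("range is undefined for an empty dataset")
    return max(values) - min(values)


observations = [20, 24, 31, 17, 45, 39, 51, 61]
print(data_range(observations))  # prints 44
```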

Range for grouped data:

Question 2: Find out the range for the following frequency distribution table for the marks scored by class 10 students. 

Marks Intervals    Number of Students
0-10               5
10-20              8
20-30              15
30-40              9

Solution:

For the largest value - Take higher limit of the highest class = 40 
For the smallest value - Take lower limit of the lowest class = 0
Range = 40 – 0 
Range = 40 
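The grouped-data rule above can be sketched in Python as follows. The representation of classes as `(lower, upper)` tuples, the use of `None` for a missing limit, and the name `grouped_range` are assumptions made for this illustration; the check for `None` reflects the earlier note that the range cannot be calculated for open-ended distributions.

```python
def grouped_range(class_intervals):
    """Upper limit of the highest class minus lower limit of the lowest class.

    class_intervals is a list of (lower, upper) tuples. A limit of None
    marks an open-ended class (an illustrative convention).
    """
    limits = [limit for interval in class_intervals for limit in interval]
    if any(limit is None for limit in limits):
        raise ValueError("range is undefined for an open-ended distribution")
    return max(limits) - min(limits)


marks = [(0, 10), (10, 20), (20, 30), (30, 40)]
print(grouped_range(marks))  # prints 40
```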

Mean Deviation

Range as a measure of dispersion depends only on the highest and the lowest values in the data. Mean deviation, on the other hand, measures the deviation of every observation from the mean of the distribution. Since the mean is a central value of the data, some deviations are positive and some are negative. If they are added as they are, the sum does not reveal much because the positive and negative deviations cancel each other. For example, 

Consider the data given below, 

-5, 10, 25

The mean of this data = 10

The deviations of the values from the mean are (-5 – 10), (10 – 10), (25 – 10), i.e. -15, 0, 15.

Adding these deviations gives zero, which wrongly suggests that the data does not deviate from the mean at all. To counter this problem, only the absolute values of the differences are used when calculating the mean deviation.

So, Mean Deviation (MD) = \frac{|x_1 - \mu| + |x_2 - \mu| + \cdots + |x_n - \mu|}{n}

Mean deviation from the mean for Ungrouped data:

For calculating the mean deviation for ungrouped data, the following steps must be followed: 

  1. Calculate the arithmetic mean for all the values of the dataset.
  2. Calculate the difference between each value of the dataset and the mean. Take only the absolute value of each difference, |d|.
  3. Calculate the arithmetic mean of these deviations.

M.D = \frac{\sum|d|}{n}

Question 3: Calculate the mean deviation for the given ungrouped data: 

2, 4, 6, 8, 10

Solution: 

Following the steps mentioned above, 

Mean = \mu = \frac{2 + 4 + 6 + 8 + 10}{5}

⇒ \mu = 6

M. D = \frac{\sum|d|}{n}

⇒ M.D = \frac{|(2 - 6)| + |(4 - 6)| + |(6 - 6)| + |(8 - 6)| + |(10 - 6)|}{5}

⇒ M.D = \frac{4 + 2 + 0 + 2 + 4}{5}

⇒M.D = \frac{12}{5}

⇒ M.D = 2.4
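The three steps can be sketched in Python as below. This is only an illustration; `mean_deviation` is a hypothetical helper name.

```python
def mean_deviation(values):
    """Mean of the absolute deviations |x - mean| of the values."""
    n = len(values)
    mean = sum(values) / n                          # step 1: arithmetic mean
    abs_devs = [abs(x - mean) for x in values]      # step 2: |d| for each value
    return sum(abs_devs) / n                        # step 3: mean of the |d|


print(mean_deviation([2, 4, 6, 8, 10]))  # prints 2.4
```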

Mean Deviation from the median for Ungrouped Data:

For calculating the mean deviation from the median for ungrouped data, the following steps must be followed: 

  1. Calculate the median of all the values of the dataset.
  2. Calculate the difference between each value of the dataset and the median. Take only the absolute value of each difference, |d|.
  3. Calculate the arithmetic mean of these deviations.

Question 4: Calculate the mean deviation from the median for the given ungrouped data: 

2, 4, 6, 8, 10

Solution: 

Following the steps mentioned above, 

The median of this data is also 6. 

M. D = \frac{\sum|d|}{n}

⇒ M.D = \frac{|(2 - 6)| + |(4 - 6)| + |(6 - 6)| + |(8 - 6)| + |(10 - 6)|}{5}

⇒ M.D = \frac{4 + 2 + 0 + 2 + 4}{5}

⇒M.D = \frac{12}{5}

⇒ M.D = 2.4
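A minimal Python sketch of the same calculation, using `median` from the standard `statistics` module, is shown below. The helper name `mean_deviation_about_median` and the second, skewed dataset are made up for illustration.

```python
from statistics import median


def mean_deviation_about_median(values):
    """Mean of the absolute deviations |x - median| of the values."""
    center = median(values)
    return sum(abs(x - center) for x in values) / len(values)


print(mean_deviation_about_median([2, 4, 6, 8, 10]))  # prints 2.4

# A skewed, made-up dataset where the mean (22) and the median (3) differ.
skewed = [1, 2, 3, 4, 100]
print(mean_deviation_about_median(skewed))
```

For symmetric data such as 2, 4, 6, 8, 10 the mean and the median coincide, so both versions of the mean deviation agree. For skewed data they generally differ.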

Mean deviation from mean for continuous frequency distribution:

For calculating the mean deviation from the mean for a continuous frequency distribution, the following steps must be followed: 

  1. Calculate the arithmetic mean of the distribution, using the middle value (midpoint) of each class interval weighted by its frequency.
  2. Calculate the difference between the middle value of each class interval and the mean. Take only the absolute value of each difference, |d|.
  3. Multiply each |d| by the corresponding class frequency f.
  4. Add these products and divide by the total frequency n = \sum f.

M.D = \frac{\sum f|d|}{n}

Question 5: Calculate the mean deviation for the given data: 

Class Interval    Frequency
0-10              4
10-20             2
20-30             4
30-40             0

Solution: 

Following the steps mentioned above, 

Mean = \mu = \frac{4(5) + 2(15) + 4(25) + 0(35)}{10} = \frac{20 + 30 + 100 + 0}{10} = \frac{150}{10} = 15

⇒ \mu = 15

M.D = \frac{\sum f|d|}{n}

⇒ M.D = \frac{4|(5 - 15)| + 2|(15 - 15)| + 4|(25 - 15)| + 0|(35 - 15)| }{10}

⇒ M.D = \frac{40 + 0 + 40 }{10}

⇒M.D = \frac{80}{10}

⇒ M.D = 8
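The grouped calculation can be sketched in Python as below. The function name, the `(lower, upper)` tuple representation of classes, and the optional `center` argument (which lets the same helper measure deviations from another central value, such as a median) are assumptions made for this illustration.

```python
def grouped_mean_deviation(class_intervals, frequencies, center=None):
    """Mean deviation of a continuous frequency distribution.

    Each class is represented by its midpoint. If center is None, the
    deviations are taken from the mean of the grouped data; otherwise
    they are taken from the supplied value.
    """
    midpoints = [(lower + upper) / 2 for lower, upper in class_intervals]
    n = sum(frequencies)                            # total frequency
    if center is None:
        center = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    return sum(f * abs(x - center) for f, x in zip(frequencies, midpoints)) / n


classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
print(grouped_mean_deviation(classes, [4, 2, 4, 0]))  # prints 8.0
```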

Mean deviation from the median for continuous frequency distribution:

For calculating the mean deviation from the median for a continuous frequency distribution, the following steps must be followed: 

  1. Calculate the median of the distribution.
  2. Calculate the difference between the middle value of each class interval and the median. Take only the absolute value of each difference, |d|.
  3. Multiply each |d| by the corresponding class frequency f.
  4. Add these products and divide by the total frequency n = \sum f.

M.D = \frac{\sum f|d|}{n}

Question 6: Calculate the mean deviation from the median for the given data: 

Class Interval    Frequency
0-10              7
10-20             1
20-30             3
30-40             0

Solution: 

Following the steps mentioned above, 

The median lies in the interval (0-10), so take its midpoint, 5, as an approximate median. The total frequency is n = 7 + 1 + 3 + 0 = 11.

M.D = \frac{\sum f|d|}{n}

⇒ M.D = \frac{7|(5 - 5)| + 1|(15 - 5)| + 3|(25 - 5)| + 0|(35 - 5)| }{11}

⇒ M.D = \frac{0 + 10 + 60 + 0}{11}

⇒ M.D = \frac{70}{11}

⇒ M.D ≈ 6.36

Quartile Deviation 

Quartiles divide the sorted data into four equal parts, just as the median divides it into two equal parts. The first quartile (Q1) is the value below which one quarter of the observations lie, and the third quartile (Q3) is the value below which three quarters of them lie. Quartile Deviation is defined as half of the distance between the first quartile and the third quartile. It is also known as the Semi-Interquartile Range. 

 Quartile Deviation = \frac{Q_3 - Q_1}{2}

Question 7: Find the first quartile, the third quartile and the Quartile Deviation for the data 8, 5, 15, 20, 18, 30, 40, 25

Solution:
Step 1:  Sort the given data in ascending order
                 5, 8, 15, 18, 20, 25, 30, 40
Step 2:  Find the quartiles step by step
         Quartile-1 = (\frac{n + 1}{4})th term
Here     n = 8 because there are 8 numbers in the given data.
         First Quartile = (\frac{8 + 1}{4})th term
                        = (\frac{9}{4})th term
                        = 2.25th term
         2.25th term    = 2nd term + (0.25)(3rd term - 2nd term)
                        = 8 + (0.25)(15 - 8) = 9.75
So the first Quartile value is 9.75
         Quartile-3 = (\frac{3(n + 1)}{4})th term
         Third Quartile = (\frac{3(8 + 1)}{4})th term
                        = (\frac{27}{4})th term
                        = 6.75th term
         6.75th term    = 6th term + (0.75)(7th term - 6th term)
                        = 25 + (0.75)(30 - 25) = 28.75
So the third Quartile value is 28.75
Quartile Deviation = \frac{Q_3 - Q_1}{2}
                   = \frac{28.75 - 9.75}{2}
                   = 9.5
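The position rule used in this solution can be sketched in Python as below. The function names are hypothetical. The clamping of positions that fall before the first or after the last term is an assumption added so the helper does not fail on very small datasets; the worked example does not need it.

```python
def quartile(values, k):
    """k-th quartile (k = 1, 2, 3) taken as the k(n + 1)/4-th term,
    interpolating linearly between neighbouring sorted terms."""
    data = sorted(values)
    n = len(data)
    position = k * (n + 1) / 4           # 1-based, possibly fractional
    lower = int(position)                # whole part: the lower term
    fraction = position - lower          # how far to move to the next term
    if lower < 1:                        # position before the first term
        return data[0]
    if lower >= n:                       # position at or after the last term
        return data[-1]
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])


def quartile_deviation(values):
    """Half of the distance between the third and the first quartile."""
    return (quartile(values, 3) - quartile(values, 1)) / 2


data = [8, 5, 15, 20, 18, 30, 40, 25]
print(quartile(data, 1), quartile(data, 3), quartile_deviation(data))
# prints 9.75 28.75 9.5
```

Other quartile conventions exist, and libraries may default to a different one, so results from a library can differ slightly from this hand method unless the same rule is selected.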

Standard Deviation 

The most important and most widely used measure of dispersion is the standard deviation, generally denoted by σ. It is computed as the square root of the mean of the squares of the differences of the variate values from their mean. 

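1.   Actual mean method

    S.D. = \sqrt{\frac{\sum (x - \mu)^2}{n}}

2.   Assumed mean method

  • Short-cut method

S.D. = \sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2}, where d = x - a and a is the assumed mean

  • Step-deviation method 

S.D. = h \sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2}, where d = \frac{x - a}{h} and h is the common class width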

Question 8: Calculate the mean and standard deviation for the following: 

Size of items:    6    7    8    9    10    11    12
Frequency:        3    6    9    13   8     5     4

Solution:  The calculations are as follows: 

Size of items x    Frequency f    Deviation d = x – 9    fd         fd^2
6                  3              -3                     -9         27
7                  6              -2                     -12        24
8                  9              -1                     -9         9
a = 9              13             0                      0          0
10                 8              1                      8          8
11                 5              2                      10         20
12                 4              3                      12         36
                   ∑f = 48                               ∑fd = 0    ∑fd^2 = 124

Mean = a + \frac{\sum fd}{\sum f} = 9 + \frac{0}{48} = 9

Standard deviation = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}, where N = \sum f = 48

                   = \sqrt{\frac{124}{48} - 0} ≈ 1.607
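The short-cut (assumed mean) calculation can be sketched in Python as below. The function name `assumed_mean_stats` is hypothetical, and the sketch treats the frequency table as the whole population, dividing by ∑f as the formula above does.

```python
from math import sqrt


def assumed_mean_stats(sizes, frequencies, assumed_mean):
    """Mean and standard deviation of a frequency table by the
    short-cut (assumed mean) method, with d = x - a."""
    total_f = sum(frequencies)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(sizes, frequencies))
    sum_fd2 = sum(f * (x - assumed_mean) ** 2 for x, f in zip(sizes, frequencies))
    mean = assumed_mean + sum_fd / total_f
    std_dev = sqrt(sum_fd2 / total_f - (sum_fd / total_f) ** 2)
    return mean, std_dev


sizes = [6, 7, 8, 9, 10, 11, 12]
frequencies = [3, 6, 9, 13, 8, 5, 4]
mean, std_dev = assumed_mean_stats(sizes, frequencies, assumed_mean=9)
print(mean, round(std_dev, 3))  # prints 9.0 1.607
```

Any assumed mean gives the same result. Choosing a value close to the true mean, as a = 9 does here, simply keeps the hand arithmetic small.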

Variance 

Variance is a statistical measure of the spread between the numbers in a dataset: it is the mean of the squared deviations from the mean. Variance is usually denoted by the symbol σ^2 and equals the square of the standard deviation.

Question 9: Referring to Question 8, find the variance.

Solution:   Variance (σ^2) = (standard deviation)^2

                           = (1.607)^2 ≈ 2.582

Using the unrounded standard deviation instead, σ^2 = \frac{124}{48} ≈ 2.583.
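As a cross-check, the sketch below expands the frequency table of Question 8 into its 48 individual observations and applies `pvariance` and `pstdev` from Python's standard `statistics` module. Both divide by n, matching the formulas used here. Squaring the rounded value 1.607 gives 2.582, while the unrounded variance 124/48 rounds to 2.583.

```python
from statistics import pstdev, pvariance

sizes = [6, 7, 8, 9, 10, 11, 12]
frequencies = [3, 6, 9, 13, 8, 5, 4]

# Expand the frequency table into the 48 individual observations.
observations = [x for x, f in zip(sizes, frequencies) for _ in range(f)]

variance = pvariance(observations)   # population variance (divides by n)
std_dev = pstdev(observations)       # population standard deviation

print(round(variance, 3), round(std_dev, 3))  # prints 2.583 1.607
```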

Relative Measures of Dispersion

These measures of dispersion are expressed as ratios or percentages. For example, the standard deviation divided by the mean is a relative measure. These measures are dimensionless and are also known as coefficients of dispersion. They come in handy when comparing the variation of two datasets that have different units. For example, consider two datasets of the weights of the same students: in one the weight is measured in kilograms, and in the other it is measured in grams. Both have the same underlying variation, but because the units differ, absolute measures of dispersion give a numerically much larger value for the dataset in grams. Since absolute measures of dispersion are not appropriate for such comparisons, relative measures of dispersion are used. Some relative measures of dispersion are: 

  1. Coefficient of Range
  2. Coefficient of Quartile Deviation
  3. Coefficient of Mean Deviation 
  4. Coefficient of Standard Deviation
  5. Coefficient of Variance
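The kilograms-versus-grams comparison can be checked numerically with the sketch below. The weights are made-up sample values and `cv_percent` is a hypothetical helper. The point is that the standard deviation changes with the unit, while the ratio of standard deviation to mean does not.

```python
from statistics import mean, pstdev


def cv_percent(data):
    """Standard deviation divided by the mean, as a percentage."""
    return pstdev(data) / mean(data) * 100


weights_kg = [48.0, 52.0, 55.0, 61.0, 64.0]    # made-up sample values
weights_g = [w * 1000 for w in weights_kg]     # same students, in grams

for unit, data in (("kg", weights_kg), ("g", weights_g)):
    # The standard deviation scales with the unit; the ratio does not.
    print(unit, round(pstdev(data), 3), round(cv_percent(data), 3))
```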

Coefficient of Range = \frac{L - S}{L + S} (100)

Question 10:   Find the coefficient of range for the following observations.

                        20, 24, 31, 17, 45, 39, 51, 61

Solution:   First arrange all observations in ascending order. 

                     17, 20, 24, 31, 39, 45, 51, 61 

       Coefficient of Range = \frac{L - S}{L + S} (100)

                            = \frac{61 - 17}{61 + 17} (100)

                            = \frac{44}{78} (100)

                            ≈ 56.41%

Coefficient of Quartile Deviation = \frac{Q_3 - Q_1}{Q_3 + Q_1} (100) 

here,    Q1 = First Quartile / lower Quartile 

         Q3 = Third Quartile / upper Quartile

Question 11:   Find the coefficient of Quartile Deviation for the data 8, 5, 15, 20, 18, 30, 40, 25

Solution:  The first and third quartiles of this data were already found in Question 7 (Q1 = 9.75, Q3 = 28.75),
            so we just have to put the values in the formula.
            Coefficient of Quartile Deviation = \frac{Q_3 - Q_1}{Q_3 + Q_1} (100) 
                                  = \frac{28.75 - 9.75}{28.75 + 9.75} (100)
                                  = \frac{19}{38.5} (100)
                                  ≈ 49.35%

Coefficient of Standard Deviation = \frac{σ}{μ}

here, σ = Standard deviation of the series

      μ = Mean of the series

Question 12:   Referring to Question 8, find the coefficient of standard deviation.

Solution:  
             Mean = 9 
             S.D. = 1.607
             Coefficient of standard deviation = \frac{σ}{μ}
                = \frac{1.607}{9}
                ≈ 0.1786

Coefficient of variance (more commonly called the coefficient of variation) = \frac{σ}{μ} (100) 

Question 13: Referring to Question 8, find the coefficient of variance.

Solution:   Mean = 9 
             S.D. = 1.607
             Coefficient of variance = \frac{σ}{μ} (100)
                = \frac{1.607}{9} (100)
                ≈ 17.86%

Coefficient of mean deviation: The coefficient of mean deviation is calculated in one of the following ways, depending on the central value from which the deviations are measured: 

  1. Coefficient of mean deviation (about mean) = \frac{\text{mean deviation from mean}}{\text{mean}} 
  2. Coefficient of mean deviation (about median) = \frac{\text{mean deviation from median}}{\text{median}}
  3. Coefficient of mean deviation (about mode) = \frac{\text{mean deviation from mode}}{\text{mode}}
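The relative measures above are all simple ratios, so they can be sketched as small Python helpers. The function names are hypothetical, and the inputs reuse the data of Question 10, the quartiles found in Question 7, and the mean and standard deviation found in Question 8. Each ratio is undefined when its denominator is zero, a case this sketch does not handle.

```python
def coefficient_of_range(values):
    """(L - S) / (L + S) as a percentage."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest) * 100


def coefficient_of_quartile_deviation(q1, q3):
    """(Q3 - Q1) / (Q3 + Q1) as a percentage."""
    return (q3 - q1) / (q3 + q1) * 100


def coefficient_of_variation(std_dev, mean):
    """Standard deviation over mean, as a percentage."""
    return std_dev / mean * 100


print(round(coefficient_of_range([20, 24, 31, 17, 45, 39, 51, 61]), 2))
print(round(coefficient_of_quartile_deviation(9.75, 28.75), 2))
print(round(coefficient_of_variation(1.607, 9), 2))
```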

Lorenz Curve

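The Lorenz curve is an important tool in economics. It is a graphical representation of the distribution of wealth or income: it plots the cumulative share of total income (or wealth) against the cumulative share of the population, ordered from poorest to richest. It was developed by Max O. Lorenz to represent the inequality of wealth distribution. The figure below shows a typical Lorenz curve. The straight line is the line of perfect equality, and the area enclosed between this line and the curve, taken as a fraction of the total area under the straight line, is the Gini coefficient. The further away the curved line is from the straight line, the more inequality in the wealth is indicated. 

This curve is also used in many other fields, such as ecology, studies of biodiversity, and business modeling. 

Gini Coefficient: It is a single number (a scalar) that summarizes the inequality shown by the Lorenz curve. It ranges from 0 (perfect equality) to 1 (maximum inequality). 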
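As a rough sketch, the Lorenz curve and the Gini coefficient of a small list of incomes can be computed as below. The sketch assumes the common definition of the Gini coefficient as the area between the line of equality and the Lorenz curve divided by the total area under the line of equality, which works out to 1 minus twice the area under the curve. It also assumes non-negative incomes with a positive total. The function names and the sample incomes are made up for illustration.

```python
def lorenz_points(incomes):
    """Cumulative (population share, income share) points of the Lorenz curve."""
    data = sorted(incomes)               # poorest to richest
    total = sum(data)
    n = len(data)
    points = [(0.0, 0.0)]
    running = 0
    for i, value in enumerate(data, start=1):
        running += value
        points.append((i / n, running / total))
    return points


def gini_coefficient(incomes):
    """1 - 2 * (area under the Lorenz curve), using the trapezoidal rule."""
    points = lorenz_points(incomes)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return 1 - 2 * area


print(gini_coefficient([10, 10, 10, 10]))  # equal incomes: prints 0.0
print(gini_coefficient([0, 0, 0, 40]))     # one person holds everything: prints 0.75
```

With only four people, the most unequal case gives 0.75 rather than 1 because the curve is built from a finite number of points. The value approaches 1 as the population grows.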

Sample Problems

Question 1: Find out the range for the following observations. 

20, 42, 13, 71, 54, 93, 15, 16

Solution:

The largest value in the given observations is 93 and the smallest value is 13. The Range is 93 – 13 = 80

Question 2: Find out the range for the following frequency distribution table for the marks scored by class 10 students. 

Marks Intervals    Number of Students
10-20              8
20-30              25
30-40              9

Solution:

For the largest value – Take higher limit of the highest class = 40 

For the smallest value – Take lower limit of the lowest class = 10

Range = 40 – 10

Range = 30

Question 3: Calculate the mean deviation for the given ungrouped data: 

-5, -4, 0, 4, 5

Solution: 

Following the steps mentioned above, 

Mean = \mu = \frac{(-5) + (-4) + 0 + 4 + 5}{5}

⇒ \mu = 0

M. D = \frac{\sum|d|}{n}

⇒ M.D = \frac{|(-5 - 0)| + |(-4 - 0)| + |(0 - 0)| + |(4 - 0)| + |(5 - 0)|}{5}

⇒ M.D = \frac{5 + 4 + 0 + 4 + 5}{5}

⇒M.D = \frac{18}{5}

⇒ M.D = 3.6

Question 4: Calculate the mean deviation from the median for the given data: 

Class Interval    Frequency
0-10              1
10-20             1
20-30             8
30-40             0

Solution: 

Following the steps mentioned above, 

The median lies in the interval (20-30), so take its midpoint, 25, as an approximate median. The total frequency is n = 10. 

M. D = \frac{\sum|d|}{n}

⇒ M.D = \frac{1|(5 - 25)| + 1|(15 - 25)| + 8|(25 - 25)| }{10}

⇒ M.D = \frac{20 + 10 + 0 }{10}

⇒M.D = \frac{30}{10}

⇒ M.D = 3

