# Measure of Dispersion

We live in an age of data: it is generated almost everywhere, and most systems are flooded with it. Many techniques are available to summarize and analyze data. The mean is one of the most important statistics for summarizing the center of a dataset, but on its own it does not describe the whole dataset: the values may be widely scattered, and the mean cannot express that. Other measures, called measures of dispersion, are therefore used to quantify the scatter in the data. Let's look at these measures in detail.

### Measures of Dispersion

Measures of dispersion describe the scatter of the data, that is, how far apart the values in the distribution are. They capture the variation between different values of the data. Intuitively, dispersion is the extent to which the points of the distribution differ from the average of the distribution. Measures of dispersion fall into two categories:

- Absolute Measures of Dispersion
- Relative Measures of Dispersion

**Absolute Measures of Dispersion**

These measures of dispersion are expressed in the same units as the data themselves, for example meters, dollars, or kilograms. Some absolute measures of dispersion are:

- **Range:** The difference between the largest and the smallest value in the distribution.
- **Quartile Deviation:** Half of the distance between the first quartile (Q1) and the third quartile (Q3). It is also known as the semi-interquartile range.
- **Mean Deviation:** The arithmetic mean of the absolute differences between the values and their mean.
- **Standard Deviation:** The square root of the arithmetic mean of the squared deviations measured from the mean.
- **Variance:** The square of the standard deviation.

**Range**

The range is the difference between the largest and the smallest values in the distribution. It can be written as **R = L – S**, where L stands for the largest value in the distribution and S stands for the smallest value. A larger range implies greater variation. One drawback of this measure is that it takes into account only the maximum and the minimum values, which are not always a good indicator of how the rest of the values are scattered.

For example,

10, 20, 15, 0, 100

The smallest value S in the data = 0, the largest value L in the data = 100

R = 100 – 0 = 100

**Note:** The range cannot be calculated for open-ended frequency distributions, that is, distributions in which either the lower limit of the lowest class or the upper limit of the highest class is not defined.

**Range for ungrouped data:**

**Question 1: Find the range of the following observations.**

**20, 24, 31, 17, 45, 39, 51, 61**

**Solution:**

The largest value in the given observations is 61 and the smallest value is 17. The Range is 61 – 17 = 44
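The formula R = L – S can be sketched in a few lines of Python. The function name `value_range` is an illustrative choice, not something defined in this article.

```python
def value_range(values):
    """Return R = L - S for a non-empty sequence of numbers."""
    largest = max(values)    # L, the largest value
    smallest = min(values)   # S, the smallest value
    return largest - smallest


observations = [20, 24, 31, 17, 45, 39, 51, 61]
print(value_range(observations))  # 44
```

Because only `max` and `min` are used, changing any value strictly between them leaves the result unchanged, which is the drawback of the range noted earlier.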

**Range for grouped data:**

**Question 2: Find the range of the following frequency distribution of the marks scored by class 10 students.**

| Marks Intervals | Number of Students |
|---|---|
| 0-10 | 5 |
| 10-20 | 8 |
| 20-30 | 15 |
| 30-40 | 9 |

**Solution:**

For the largest value, take the upper limit of the highest class: 40.

For the smallest value, take the lower limit of the lowest class: 0.

Range = 40 – 0

Range = 40

### Mean Deviation

The range depends only on the highest and the lowest values in the data. Mean deviation, on the other hand, measures how far the observations deviate from the mean of the distribution. Since the mean is a central value of the data, some deviations are positive and some are negative. If they are added as they are, the sum reveals little, because the positive and negative deviations cancel each other out. For example,

Consider the data given below,

-5, 10, 25

The mean of this data = 10

The deviations from the mean are (–5 – 10), (10 – 10), (25 – 10), i.e. –15, 0, 15.

Adding these deviations gives zero, which wrongly suggests that the data do not deviate from the mean at all. To counter this problem, only the absolute values of the differences are used when calculating the mean deviation.

So, Mean Deviation (MD) = $\frac{\sum |x_i - \mu|}{n}$, where $\mu$ is the mean and $n$ is the number of observations.

**Mean deviation from the mean for ungrouped data:**

For calculating the mean deviation for ungrouped data, the following steps must be followed:

- Calculate the arithmetic mean of all the values in the dataset.
- Calculate the difference between each value of the dataset and the mean. Only the absolute values of the differences, |d|, are considered.
- Calculate the arithmetic mean of these absolute deviations.

M.D = $\frac{\sum |d|}{n} = \frac{\sum |x_i - \mu|}{n}$

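**Question 3: Calculate the mean deviation for the given ungrouped data:**

**2, 4, 6, 8, 10**

**Solution:**

Following the steps mentioned above,

Mean = $\frac{2 + 4 + 6 + 8 + 10}{5}$

⇒ Mean = $\frac{30}{5}$ = 6

M.D = $\frac{\sum |x_i - \mu|}{n}$

⇒ M.D = $\frac{|2 - 6| + |4 - 6| + |6 - 6| + |8 - 6| + |10 - 6|}{5}$

⇒ M.D = $\frac{4 + 2 + 0 + 2 + 4}{5}$

⇒ M.D = $\frac{12}{5}$

⇒ M.D = 2.4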
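The three steps for ungrouped data can be sketched in Python as follows. The function name `mean_deviation_about_mean` is illustrative. The snippet also shows why absolute values are needed: the signed deviations of the earlier example (–5, 10, 25) sum to zero.

```python
def mean_deviation_about_mean(values):
    """Mean of the absolute deviations of the values from their mean."""
    n = len(values)
    mean = sum(values) / n                          # step 1: arithmetic mean
    abs_deviations = [abs(x - mean) for x in values]  # step 2: |d| for each value
    return sum(abs_deviations) / n                  # step 3: mean of the |d| values


# Signed deviations cancel out, so their sum says nothing about the scatter.
example = [-5, 10, 25]
example_mean = sum(example) / len(example)
print(sum(x - example_mean for x in example))        # 0.0

print(mean_deviation_about_mean([2, 4, 6, 8, 10]))   # 2.4
```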

**Mean deviation from the median for ungrouped data:**

For calculating the mean deviation from the median for ungrouped data, the following steps must be followed:

- Calculate the median of all the values in the dataset.
- Calculate the difference between each value of the dataset and the median. Only the absolute values of the differences, |d|, are considered.
- Calculate the arithmetic mean of these absolute deviations.

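**Question 4: Calculate the mean deviation from the median for the given ungrouped data:**

**2, 4, 6, 8, 10**

**Solution:**

Following the steps mentioned above,

The data are already sorted and the middle value is 6, so the median $M$ is also 6.

M.D = $\frac{\sum |x_i - M|}{n}$

⇒ M.D = $\frac{|2 - 6| + |4 - 6| + |6 - 6| + |8 - 6| + |10 - 6|}{5}$

⇒ M.D = $\frac{4 + 2 + 0 + 2 + 4}{5}$

⇒ M.D = $\frac{12}{5}$

⇒ M.D = 2.4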
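The same calculation about the median can be sketched with the standard library's `statistics.median`. The function name `mean_deviation_about_median` is an illustrative choice.

```python
from statistics import median


def mean_deviation_about_median(values):
    """Mean of the absolute deviations of the values from their median."""
    m = median(values)                        # step 1: median (sorts internally)
    abs_deviations = [abs(x - m) for x in values]  # step 2: |d| for each value
    return sum(abs_deviations) / len(values)  # step 3: mean of the |d| values


print(mean_deviation_about_median([2, 4, 6, 8, 10]))  # 2.4
```

For this symmetric dataset the mean and the median coincide, so both versions of the mean deviation agree. For skewed data they generally differ.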

**Mean deviation from the mean for a continuous frequency distribution:**

For calculating the mean deviation for a continuous frequency distribution, the following steps must be followed:

- Calculate the arithmetic mean of the distribution, using the middle value of each class interval as its representative value.
- Calculate the difference between the middle value of each class interval and the mean. Only the absolute values of the differences, |d|, are considered.
- Multiply each |d| by the corresponding class frequency.
- Divide the sum of these products by the total frequency.

M.D = $\frac{\sum f_i |x_i - \mu|}{\sum f_i}$, where $x_i$ is the middle value and $f_i$ is the frequency of the $i$-th class.

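**Question 5: Calculate the mean deviation for the given data:**

| Class Interval | Frequency |
|---|---|
| 0-10 | 4 |
| 10-20 | 2 |
| 20-30 | 4 |
| 30-40 | 0 |

**Solution:**

Following the steps mentioned above, with class middle values 5, 15, 25, and 35,

Mean = $\frac{4(5) + 2(15) + 4(25) + 0(35)}{4 + 2 + 4 + 0}$

⇒ Mean = $\frac{150}{10}$ = 15

M.D = $\frac{\sum f_i |x_i - \mu|}{\sum f_i}$

⇒ M.D = $\frac{4|5 - 15| + 2|15 - 15| + 4|25 - 15| + 0|35 - 15|}{10}$

⇒ M.D = $\frac{40 + 0 + 40 + 0}{10}$

⇒ M.D = $\frac{80}{10}$

⇒ M.D = 8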
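A minimal Python sketch of the grouped calculation, assuming each class is given as a `(lower, upper)` pair and is represented by its middle value. The function name `grouped_mean_deviation` is illustrative.

```python
def grouped_mean_deviation(classes, frequencies):
    """Mean deviation about the mean for a continuous frequency distribution."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    total = sum(frequencies)                                    # sum of f
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / total
    weighted = sum(f * abs(x - mean) for f, x in zip(frequencies, midpoints))
    return weighted / total                                     # sum of f|d| / sum of f


classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [4, 2, 4, 0]
print(grouped_mean_deviation(classes, frequencies))  # 8.0
```

A class with frequency 0 contributes nothing to either sum, so the empty 30-40 class does not affect the result.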

**Mean deviation from the median for a continuous frequency distribution:**

For calculating the mean deviation from the median for a continuous frequency distribution, the following steps must be followed:

- Calculate the median of the distribution.
- Calculate the difference between the middle value of each class interval and the median. Only the absolute values of the differences, |d|, are considered.
- Multiply each |d| by the corresponding class frequency.
- Divide the sum of these products by the total frequency.

**M.D =** $\frac{\sum f_i |x_i - M|}{\sum f_i}$, where $M$ is the median, $x_i$ is the middle value, and $f_i$ is the frequency of the $i$-th class.

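**Question 6: Calculate the mean deviation from the median for the given data:**

| Class Interval | Frequency |
|---|---|
| 0-10 | 7 |
| 10-20 | 1 |
| 20-30 | 3 |
| 30-40 | 0 |

**Solution:**

Following the steps mentioned above,

The total frequency is 11, and the cumulative frequency first reaches 11/2 = 5.5 in the interval 0-10, so the median lies in that interval. Taking the middle value of this interval as an approximation, let's say the median $M$ is 5.

M.D = $\frac{\sum f_i |x_i - M|}{\sum f_i}$

⇒ M.D = $\frac{7|5 - 5| + 1|15 - 5| + 3|25 - 5| + 0|35 - 5|}{7 + 1 + 3 + 0}$

⇒ M.D = $\frac{0 + 10 + 60 + 0}{11}$

⇒ M.D = $\frac{70}{11}$

⇒ M.D ≈ 6.36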
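The steps above can be sketched in Python under the same simplification the worked solution uses: the median is approximated by the middle value of the class in which the cumulative frequency first reaches half of the total. Both function names are illustrative, and an interpolated median would give a somewhat different result.

```python
def median_class_midpoint(classes, frequencies):
    """Middle value of the class where the cumulative frequency first reaches N/2."""
    half = sum(frequencies) / 2
    cumulative = 0
    for (lower, upper), f in zip(classes, frequencies):
        cumulative += f
        if cumulative >= half:
            return (lower + upper) / 2
    raise ValueError("classes must not be empty")


def grouped_mean_deviation_about(center, classes, frequencies):
    """Sum of f * |midpoint - center| divided by the total frequency."""
    total = sum(frequencies)
    weighted = sum(
        f * abs((lower + upper) / 2 - center)
        for (lower, upper), f in zip(classes, frequencies)
    )
    return weighted / total


classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [7, 1, 3, 0]
approx_median = median_class_midpoint(classes, frequencies)
print(approx_median)
print(round(grouped_mean_deviation_about(approx_median, classes, frequencies), 2))
```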

### Quartile Deviation

Quartiles divide the data into four equal parts, in the same way that the median divides the data into two equal parts. Quartiles are a concept from statistics, the study of collecting, analyzing, interpreting, and presenting data. **Quartile Deviation** is defined as half of the distance between the first quartile (Q1) and the third quartile (Q3). It is also known as the semi-interquartile range.

**Quartile Deviation:** $\frac{Q_3 - Q_1}{2}$

**Question 7: Find the first quartile, the third quartile, and the quartile deviation for the data 8, 5, 15, 20, 18, 30, 40, 25.**

**Solution:**

**Step 1:** Sort the given data in ascending order: 5, 8, 15, 18, 20, 25, 30, 40.

**Step 2:** Find the quartiles one by one.

First Quartile = $\left(\frac{n + 1}{4}\right)$th term

Here n = 8 because there are 8 numbers in the given data.

First Quartile = $\left(\frac{8 + 1}{4}\right)$th term = $\left(\frac{9}{4}\right)$th term = 2.25th term

2.25th term = 2nd term + (0.25)(3rd term – 2nd term) = 8 + (0.25)(15 – 8) = 9.75

The first quartile value is 9.75.

Third Quartile = $\left(\frac{3(n + 1)}{4}\right)$th term

Third Quartile = $\left(\frac{3(8 + 1)}{4}\right)$th term = $\left(\frac{27}{4}\right)$th term = 6.75th term

6.75th term = 6th term + (0.75)(7th term – 6th term) = 25 + (0.75)(30 – 25) = 28.75

So the third quartile value is 28.75.

Quartile Deviation = $\frac{Q_3 - Q_1}{2} = \frac{28.75 - 9.75}{2} = 9.5$
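The positional rule used in this solution, the k(n + 1)/4-th term with linear interpolation between neighbouring terms, can be sketched in Python as below. The function names are illustrative, and other quartile conventions exist that give slightly different values.

```python
def quartile(sorted_values, k):
    """k-th quartile (k = 1, 2, 3) using the k(n + 1)/4-th term rule."""
    n = len(sorted_values)
    position = k * (n + 1) / 4        # 1-based position, possibly fractional
    lower = int(position)             # whole part: index of the lower term
    fraction = position - lower       # fractional part used for interpolation
    if lower < 1:                     # position falls before the first term
        return sorted_values[0]
    if lower >= n:                    # position falls on or after the last term
        return sorted_values[-1]
    below = sorted_values[lower - 1]  # convert 1-based term number to 0-based index
    above = sorted_values[lower]
    return below + fraction * (above - below)


def quartile_deviation(values):
    """Half of the distance between the third and the first quartile."""
    ordered = sorted(values)
    return (quartile(ordered, 3) - quartile(ordered, 1)) / 2


data = [8, 5, 15, 20, 18, 30, 40, 25]
ordered = sorted(data)
print(quartile(ordered, 1))       # 9.75
print(quartile(ordered, 3))       # 28.75
print(quartile_deviation(data))   # 9.5
```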

### Standard Deviation

The most important and most powerful measure of dispersion is the standard deviation, generally denoted by σ. It is computed as the square root of the mean of the squares of the differences of the variate values from their mean.

**1. Actual mean method**

**S.D. =** $\sqrt{\frac{\sum (x - \mu)^2}{n}}$

**2. Assumed mean method**

**Short-cut method**

**S.D. =** $\sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2}$, where d = x – a is the deviation of each value from the assumed mean a.

**Step-deviation method**

**S.D. =** $h\sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2}$, where d = (x – a)/h and h is the common class width.

**Question 8: Calculate the mean and standard deviation for the following:**

| Size of items | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|
| Frequency | 3 | 6 | 9 | 13 | 8 | 5 | 4 |

**Solution:** Taking the assumed mean a = 9, the calculations are as follows:

| Size of items x | Frequency f | Deviation d = x – 9 | fd | $fd^2$ |
|---|---|---|---|---|
| 6 | 3 | -3 | -9 | 27 |
| 7 | 6 | -2 | -12 | 24 |
| 8 | 9 | -1 | -9 | 9 |
| a = 9 | 13 | 0 | 0 | 0 |
| 10 | 8 | 1 | 8 | 8 |
| 11 | 5 | 2 | 10 | 20 |
| 12 | 4 | 3 | 12 | 36 |
| | ∑f = 48 | | ∑fd = 0 | $\sum fd^2$ = 124 |

Mean = $a + \frac{\sum fd}{\sum f} = 9 + \frac{0}{48} = 9$

Mean = 9

Standard deviation = $\sqrt{\frac{\sum fd^2}{\sum f} - \left(\frac{\sum fd}{\sum f}\right)^2}$

= $\sqrt{\frac{124}{48} - \left(\frac{0}{48}\right)^2} = \sqrt{\frac{124}{48}} \approx 1.607$
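The short-cut (assumed mean) calculation can be sketched in Python as follows. The function name `assumed_mean_stats` is illustrative. The assumed mean only shifts the deviations, so any choice of `assumed_mean` should give the same mean and standard deviation up to floating-point rounding.

```python
from math import sqrt


def assumed_mean_stats(values, frequencies, assumed_mean):
    """Mean and standard deviation of a frequency table via the short-cut method."""
    total = sum(frequencies)                              # sum of f
    deviations = [x - assumed_mean for x in values]       # d = x - a
    sum_fd = sum(f * d for f, d in zip(frequencies, deviations))
    sum_fd2 = sum(f * d * d for f, d in zip(frequencies, deviations))
    mean = assumed_mean + sum_fd / total
    sd = sqrt(sum_fd2 / total - (sum_fd / total) ** 2)
    return mean, sd


sizes = [6, 7, 8, 9, 10, 11, 12]
freqs = [3, 6, 9, 13, 8, 5, 4]
mean, sd = assumed_mean_stats(sizes, freqs, assumed_mean=9)
print(mean, round(sd, 3))  # 9.0 1.607
```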

### Variance

Variance is a statistical measurement of the spread between the numbers in a dataset. It is usually denoted by σ² and is calculated as the square of the standard deviation.

**Question 9: Referring to Question 8, find the variance.**

**Solution:**

Variance (σ²) = (standard deviation)²

= (1.607)² ≈ 2.582

Without rounding the standard deviation first, σ² = 124/48 ≈ 2.583.
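As a cross-check, the frequency table from Question 8 can be expanded into raw observations and passed to the standard library's `statistics.pvariance` and `statistics.pstdev`, which compute the population variance and standard deviation. This is a sketch for verification only. The unrounded variance is 124/48, so it can differ in the last digit from the value obtained by squaring the rounded standard deviation 1.607.

```python
from statistics import pstdev, pvariance

sizes = [6, 7, 8, 9, 10, 11, 12]
freqs = [3, 6, 9, 13, 8, 5, 4]

# Repeat each size as many times as its frequency to rebuild the 48 observations.
observations = [x for x, f in zip(sizes, freqs) for _ in range(f)]

variance = float(pvariance(observations))  # population variance, sigma squared
std_dev = float(pstdev(observations))      # population standard deviation, sigma

print(len(observations))   # 48
print(round(variance, 3))  # 2.583
print(round(std_dev, 3))   # 1.607
```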

**Relative Measures of Dispersion**

These measures of dispersion are expressed as ratios or percentages. For example, the standard deviation divided by the mean is a relative measure. These measures are dimensionless and are also known as coefficients of dispersion. They come in handy when comparing the variation of two datasets that have different units. For example, consider two datasets of weights of students. In one dataset the weight is measured in kilograms, and in the other it is measured in grams. Both have equivalent variation in their values, but because the units are different, absolute measures of dispersion give a much larger value for the dataset with weights in grams. Since absolute measures of dispersion are not appropriate in such cases, relative measures of dispersion are used. The common relative measures are:

- **Coefficient of Range**
- **Coefficient of Quartile Deviation**
- **Coefficient of Mean Deviation**
- **Coefficient of Standard Deviation**
- **Coefficient of Variance**

__Coefficient of Range__: $\frac{L - S}{L + S} \times 100$

**Question 10: Find the coefficient of range for the following observations.**

20, 24, 31, 17, 45, 39, 51, 61

**Solution:** First, arrange all the observations in ascending order.

17, 20, 24, 31, 39, 45, 51, 61

Here L = 61 and S = 17.

Coefficient of Range = $\frac{L - S}{L + S} \times 100$

= $\frac{61 - 17}{61 + 17} \times 100$

= $\frac{44}{78} \times 100$

≈ 56.41%
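The coefficient of range formula can be sketched in Python and applied to the observations in Question 10. The function name `coefficient_of_range` is illustrative, and the sketch assumes L + S is not zero.

```python
def coefficient_of_range(values):
    """Return (L - S) / (L + S) * 100 for a non-empty sequence of numbers."""
    largest = max(values)    # L
    smallest = min(values)   # S
    if largest + smallest == 0:
        raise ValueError("L + S is zero, so the coefficient is undefined")
    return (largest - smallest) / (largest + smallest) * 100


observations = [20, 24, 31, 17, 45, 39, 51, 61]
print(round(coefficient_of_range(observations), 2))
```

Sorting is not required by the code, since `max` and `min` find L and S directly. Arranging the data in ascending order is only a convenience when working by hand.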

__Coefficient of Quartile Deviation__: $\frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100$

Here, $Q_1$ = first quartile (lower quartile)

$Q_3$ = third quartile (upper quartile)

**Question 11: Find the coefficient of quartile deviation for the data 8, 5, 15, 20, 18, 30, 40, 25.**

**Solution:** The first and third quartiles of this data were already found in Question 7: $Q_1$ = 9.75 and $Q_3$ = 28.75. Now we just have to put these values into the formula.

Coefficient of Quartile Deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100 = \frac{28.75 - 9.75}{28.75 + 9.75} \times 100 = \frac{19}{38.5} \times 100 \approx 49.35\%$

__Coefficient of Standard Deviation__: $\frac{\sigma}{\mu}$

Here, σ = standard deviation of the series

μ = mean of the series

**Question 12: Referring to Question 8, find the coefficient of standard deviation.**

**Solution:**

Mean = 9

S.D. = 1.607

Coefficient of standard deviation = $\frac{\sigma}{\mu} = \frac{1.607}{9} \approx 0.1786$

__Coefficient of Variance__ (more commonly called the coefficient of variation): $\frac{\sigma}{\mu} \times 100$

**Question 13: Referring to Question 8, find the coefficient of variance.**

**Solution:**

Mean = 9

S.D. = 1.607

Coefficient of variance = $\frac{\sigma}{\mu} \times 100 = \frac{1.607}{9} \times 100 \approx 17.86\%$
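The sketch below computes σ/μ × 100 with the standard library and illustrates why a relative measure is useful when units differ. The weights are hypothetical values invented for the illustration, and the function name `coefficient_of_variation` is an illustrative choice.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return (sigma / mu) * 100 using the population standard deviation."""
    return pstdev(values) / mean(values) * 100


# Hypothetical weights of five students, first in kilograms, then in grams.
weights_kg = [48, 52, 55, 60, 65]
weights_g = [w * 1000 for w in weights_kg]

# Absolute measure: the standard deviation scales with the unit.
print(pstdev(weights_kg), pstdev(weights_g))

# Relative measure: the coefficient is the same in both units (up to rounding).
print(round(coefficient_of_variation(weights_kg), 4))
print(round(coefficient_of_variation(weights_g), 4))
```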

__Coefficient of Mean Deviation__: The coefficient of mean deviation can be calculated in the following ways:

- Coefficient of mean deviation (about the mean): $\frac{\text{mean deviation from the mean}}{\text{mean}}$
- Coefficient of mean deviation (about the median): $\frac{\text{mean deviation from the median}}{\text{median}}$
- Coefficient of mean deviation (about the mode): $\frac{\text{mean deviation from the mode}}{\text{mode}}$
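All three variants share one pattern: the mean deviation measured from a chosen central value, divided by that central value. A minimal Python sketch follows, with the illustrative function name `coefficient_of_mean_deviation`, applied to the data 2, 4, 6, 8, 10 used earlier. It assumes the central value is not zero.

```python
from statistics import mean, median


def coefficient_of_mean_deviation(values, center):
    """Mean absolute deviation from `center`, divided by `center`."""
    mean_deviation = sum(abs(x - center) for x in values) / len(values)
    return mean_deviation / center


data = [2, 4, 6, 8, 10]
print(round(coefficient_of_mean_deviation(data, mean(data)), 4))    # 0.4
print(round(coefficient_of_mean_deviation(data, median(data)), 4))  # 0.4
```

Passing the mode as `center` gives the third variant in the same way, provided the data have a single well-defined mode.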

**Lorenz Curve**

The Lorenz curve is an important tool in economics. It is a graphical representation of the distribution of wealth and income, developed by Max O. Lorenz to represent inequality in the distribution of wealth. A typical Lorenz curve plots the cumulative share of the population against the cumulative share of wealth or income that it holds. A straight diagonal line represents perfect equality, and the curved line below it represents the actual distribution. The further away the curved line is from the straight line, the more inequality in wealth is indicated.

This curve is also used in many other fields, such as ecology, studies of biodiversity, and business modeling.

**Gini Coefficient:** A scalar measure of inequality derived from the Lorenz curve. It is the ratio of the area enclosed between the straight line and the curved line to the total area under the straight line.
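A small Python sketch of how Lorenz curve points and a Gini coefficient might be computed from a list of incomes. It assumes the usual convention that the Gini coefficient is the area between the line of equality and the Lorenz curve divided by the area under the line of equality (0.5), and it approximates the area under the curve with the trapezoidal rule. The function names and the incomes are hypothetical, chosen only for illustration.

```python
def lorenz_points(incomes):
    """Cumulative (population share, income share) points, starting at (0, 0)."""
    ordered = sorted(incomes)          # poorest first
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, income in enumerate(ordered, start=1):
        running += income
        points.append((i / n, running / total))
    return points


def gini(incomes):
    """Area between the equality line and the Lorenz curve, divided by 0.5."""
    points = lorenz_points(incomes)
    # Trapezoidal rule over consecutive Lorenz points.
    area_under_curve = sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return (0.5 - area_under_curve) / 0.5


print(gini([10, 10, 10, 10]))   # equal incomes: 0.0
print(gini([0, 0, 0, 100]))     # one person holds everything: 0.75
```

With a finite group of n people, complete concentration gives (n – 1)/n rather than 1, which is why the second example prints 0.75.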

### Sample Problems

**Question 1: Find the range of the following observations.**

**20, 42, 13, 71, 54, 93, 15, 16**

**Solution:**

The largest value in the given observations is 93 and the smallest value is 13. The Range is 93 – 13 = 80

**Question 2: Find the range of the following frequency distribution of the marks scored by class 10 students.**

| Marks Intervals | Number of Students |
|---|---|
| 10-20 | 8 |
| 20-30 | 25 |
| 30-40 | 9 |

**Solution:**

For the largest value, take the upper limit of the highest class: 40.

For the smallest value, take the lower limit of the lowest class: 10.

Range = 40 – 10

Range = 30

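**Question 3: Calculate the mean deviation for the given ungrouped data:**

**-5, -4, 0, 4, 5**

**Solution:**

Following the steps for ungrouped data,

Mean = $\frac{-5 - 4 + 0 + 4 + 5}{5}$

⇒ Mean = $\frac{0}{5}$ = 0

M.D = $\frac{\sum |x_i - \mu|}{n}$

⇒ M.D = $\frac{|-5 - 0| + |-4 - 0| + |0 - 0| + |4 - 0| + |5 - 0|}{5}$

⇒ M.D = $\frac{5 + 4 + 0 + 4 + 5}{5}$

⇒ M.D = $\frac{18}{5}$

⇒ M.D = 3.6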

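**Question 4: Calculate the mean deviation from the median for the given data:**

| Class Interval | Frequency |
|---|---|
| 0-10 | 1 |
| 10-20 | 1 |
| 20-30 | 8 |
| 30-40 | 0 |

**Solution:**

Following the steps for a continuous frequency distribution,

The total frequency is 10, and the cumulative frequency first reaches 10/2 = 5 in the interval 20-30, so the median lies in that interval. Taking the middle value of this interval as an approximation, let's say the median $M$ is 25.

M.D = $\frac{\sum f_i |x_i - M|}{\sum f_i}$

⇒ M.D = $\frac{1|5 - 25| + 1|15 - 25| + 8|25 - 25| + 0|35 - 25|}{1 + 1 + 8 + 0}$

⇒ M.D = $\frac{20 + 10 + 0 + 0}{10}$

⇒ M.D = $\frac{30}{10}$

⇒ M.D = 3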