Standard deviation and variance are the two most commonly used measures of spread in a set of values. The standard deviation (σ) of a set of numbers measures how far the numbers are spread out from their mean, and it is obtained by taking the square root of the variance. The variance (σ²) is the average squared deviation of the values from the mean. In other words, it is equal to the mean of the squared differences between each value and the mean of the set.

**Standard Deviation and Variance of Ungrouped Data**

The variance of ungrouped data is calculated as follows:

- Calculate the mean of the values provided.
- Calculate the difference between each value and the mean. This difference is also known as the deviation about the mean.
- Square each of the deviations obtained in the previous step and sum all the squared values.
- Divide the calculated sum by the number of values, n.

The formula used to calculate the variance is shown below:

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$$

where x_i is the i-th value, x̄ is the mean, and n is the number of values in the set.

To calculate the standard deviation (σ), we first calculate the variance using the previous steps, then take its square root:

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
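The steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names `population_variance` and `population_std` are hypothetical, the sample data is made up for the demonstration, and the code divides by n (the form used in this article) rather than by n − 1.

```python
import math

def population_variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / n

def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, not from the article
print(population_variance(data))  # 4.0
print(population_std(data))       # 2.0
```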

**Measures of Dispersion: Range, Deviation, and Variance**

Statistical dispersion is the degree to which a set of values is spread out. Variance, standard deviation, and range, which is the difference between the largest and smallest value in a dataset, are all examples of measures of dispersion. The larger the range, standard deviation, and variance, the larger the dispersion of the values.

### Sample Problems on Range, Variance, and Standard Deviation

The following examples illustrate these three concepts. We assume two sets of random numbers: Set1 = {1, 3, 7, 9, 11, 15}, Set2 = {10, 20, 33, 67, 82}

**Example 1:** This example explains how to calculate the range of a dataset.

**Solution:**

- The range is the difference between the highest value and the lowest value for a given set of values.
- In Set1, the largest value is 15 and the smallest value is 1. Therefore, the range of Set1 is 15 – 1 = 14.
- In Set2, the largest value is 82 and the smallest value is 10. Therefore, the range of Set2 is 82 – 10 = 72.

We conclude that Set2 has a higher dispersion because it has a higher range.
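As a quick check, the range calculation can be sketched in Python. The helper name `value_range` is an assumption made for this illustration.

```python
def value_range(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)

set1 = [1, 3, 7, 9, 11, 15]
set2 = [10, 20, 33, 67, 82]

print(value_range(set1))                      # 14
print(value_range(set2) > value_range(set1))  # True
```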

**Example 2:** This example explains how to calculate the variance of a dataset.

**Solution:**

- To calculate the variance of Set1, we first have to calculate the mean:

M1 = (1 + 3 + 7 + 9 + 11 + 15) / 6 = 46/6 = 23/3 ≈ 7.7

- The absolute deviations of the values 1, 3, 7, 9, 11, 15 from the mean are, respectively: 6.7, 4.7, 0.7, 1.3, 3.3, 7.3.

V1 = ((6.7)^2 + (4.7)^2 + (0.7)^2 + (1.3)^2 + (3.3)^2 + (7.3)^2) / 6 = 133.34 / 6 ≈ 22.2

- To calculate the variance of Set2, we first have to calculate the mean:

M2 = (10 + 20 + 33 + 67 + 82) / 5 = 212/5 = 42.4

- The absolute deviations of the values 10, 20, 33, 67, 82 from the mean are, respectively: 32.4, 22.4, 9.4, 24.6, 39.6.

V2 = ((32.4)^2 + (22.4)^2 + (9.4)^2 + (24.6)^2 + (39.6)^2) / 5 = 3813.2 / 5 = 762.64

- We conclude that Set2 has a higher dispersion because it has a higher variance.

**Example 3:** This example explains how to calculate the standard deviation.

**Solution:**

- From the values of V1 and V2 obtained in the previous example, we calculate:

σ1 = √(22.2) ≈ 4.7

σ2 = √(762.64) ≈ 27.6

- We conclude that Set2 has a higher dispersion because it has a higher standard deviation.
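The variance and standard deviation of the two sets can be cross-checked with Python's standard `statistics` module, whose `pvariance` and `pstdev` functions divide by n. This is only a verification sketch. The values are computed at full precision, so they can differ in the last digit from hand calculations that use rounded deviations.

```python
import statistics

set1 = [1, 3, 7, 9, 11, 15]
set2 = [10, 20, 33, 67, 82]

# Population variance and population standard deviation (both divide by n).
print(round(statistics.pvariance(set1), 2), round(statistics.pstdev(set1), 2))  # 22.22 4.71
print(round(statistics.pvariance(set2), 2), round(statistics.pstdev(set2), 2))  # 762.64 27.62
```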

## Range and Mean Deviation for Grouped Data

Grouped data is classified into two types. The first is a continuous frequency distribution, where the values are grouped into intervals and each interval is associated with a frequency value. The second is a discrete frequency distribution, where each individual value is associated with a frequency value.

**Range**

- To calculate the range of a continuous frequency distribution, we calculate the difference between the upper limit of the maximum interval and the lower limit of the minimum interval. Assuming the minimum interval is (a – f) and the maximum interval is (v – z):

  $$\text{Range} = z - a$$

- For a discrete frequency distribution, we simply calculate the difference between the largest value (L) and the smallest value (S):

  $$\text{Range} = L - S$$
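Both range rules can be sketched in Python. The function names and the sample intervals are assumptions made for illustration. Intervals are represented here as `(lower_limit, upper_limit)` pairs.

```python
def continuous_range(intervals):
    """Upper limit of the highest interval minus lower limit of the lowest."""
    lowest = min(lower for lower, _ in intervals)
    highest = max(upper for _, upper in intervals)
    return highest - lowest

def discrete_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

print(continuous_range([(0, 10), (10, 20), (20, 30)]))  # 30
print(discrete_range([2, 5, 9]))                        # 7
```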

**Mean**

- To calculate the mean of a continuous frequency distribution, we take the value at the center (mid-point) of each interval and multiply it by the frequency of that interval. Then, we sum these products and divide the sum by the total number of values (the sum of all frequency values). The following formula is used:

  $$\bar{x} = \frac{\sum_{i} f_i x_i}{\sum_{i} f_i}$$

  where x_i is the mid-point of interval i and f_i is the frequency of interval i.

- The calculation of the mean for a discrete frequency distribution is the same as that of a continuous frequency distribution, with one difference: a discrete frequency distribution has discrete values instead of intervals. Therefore, instead of taking the value at the center of an interval, we take each discrete value, multiply it by its frequency, then sum these products and divide the sum by the total frequency. The same formula is used. However, in this case, x_i is the discrete value i and f_i is the frequency of the discrete value i.
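A minimal Python sketch of the frequency-weighted mean follows. The names `grouped_mean` and `midpoints` and the sample data are hypothetical. The same function serves both cases because a continuous distribution is reduced to its interval mid-points first.

```python
def midpoints(intervals):
    """Center value of each (lower_limit, upper_limit) interval."""
    return [(lower + upper) / 2 for lower, upper in intervals]

def grouped_mean(values, frequencies):
    """Sum of value * frequency, divided by the total frequency."""
    weighted_sum = sum(f * x for x, f in zip(values, frequencies))
    return weighted_sum / sum(frequencies)

# Continuous case: use the mid-points 5, 15, 25.
print(grouped_mean(midpoints([(0, 10), (10, 20), (20, 30)]), [2, 5, 3]))  # 16.0
# Discrete case: use the values directly.
print(grouped_mean([2, 5, 9], [1, 2, 1]))  # 5.25
```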

**Mean Deviation**

- To calculate the mean deviation for a continuous frequency distribution, we calculate the absolute difference between each interval's mid-point and the mean. Then, we multiply each difference by the interval's frequency and sum all the resulting values. Finally, we divide the sum by the total number of values (total frequency). The following formula is used:

  $$\text{Mean Deviation} = \frac{\sum_{i} f_i \, |x_i - \bar{x}|}{\sum_{i} f_i}$$

  where x_i is the mid-point of interval i, f_i is the frequency of interval i, and x̄ is the mean.

- The calculation of the mean deviation for a discrete frequency distribution is the same as that of a continuous frequency distribution, but instead of taking the value at the center of an interval, we take each discrete value, calculate the absolute difference between the value and the mean, multiply the difference by the frequency of the discrete value, then sum these products and divide the sum by the total frequency. The same formula is used. However, in this case, x_i is the discrete value i and f_i is the frequency of the discrete value i.
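The mean deviation procedure can be sketched as follows. The function name `grouped_mean_deviation` and the sample data are assumptions for illustration. For a continuous distribution, `values` would hold the interval mid-points.

```python
def grouped_mean_deviation(values, frequencies):
    """Frequency-weighted mean of the absolute deviations from the mean."""
    total_frequency = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / total_frequency
    weighted_abs_deviations = sum(
        f * abs(x - mean) for x, f in zip(values, frequencies)
    )
    return weighted_abs_deviations / total_frequency

# Mid-points 5, 15, 25 with frequencies 2, 5, 3: the mean is 16.
print(grouped_mean_deviation([5, 15, 25], [2, 5, 3]))  # 5.4
```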

**Calculation of Mean, Median, and Mode**

The mean, median, and mode can tell us which value can represent the data set, each in a different way. These three measures of central tendency are explained below:

- To calculate the mean, we divide the sum of the values by the number of the given values.

- The median is the number at the center of the dataset when the set is arranged in ascending or descending order. In a dataset with n values:

If n is an odd number, we calculate (n + 1)/2. The value at the resulting index is the median, where the index of the first value is 1, the second is 2, and so on.

If n is an even number, the values at the indexes n/2 and (n/2) + 1 are summed and the sum is divided by 2 to get the average value. This value is the median of the set.

- The mode is the value that occurs most frequently in a set of values.
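The median and mode rules can be sketched in Python. The function names and sample data are hypothetical. Python lists are indexed from 0, so the 1-based positions used in the text are shifted down by one in the code.

```python
from collections import Counter

def median(values):
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # 1-based position (n + 1) / 2 is 0-based index (n - 1) // 2.
        return ordered[(n - 1) // 2]
    # 1-based positions n/2 and n/2 + 1 are 0-based n//2 - 1 and n//2.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

def modes(values):
    """All values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

print(median([5, 1, 3]))       # 3
print(median([4, 1, 3, 2]))    # 2.5
print(modes([2, 2, 3, 3, 4]))  # [2, 3]
```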

### Sample Problems

**Example 1: Given a set of ungrouped values {7, 8, 3, 6, 7, 8, 9, 7, 5, -2}, calculate the mean, median, and mode of this set.**

**Solution:**

Mean:

We first sum the values: sum = 7 + 8 + 3 + 6 + 7 + 8 + 9 + 7 + 5 + (-2) = 58

We have n = 10 values in the set. Therefore, we divide the sum by 10.

Mean = 58/10 = 5.8

Median:

We first arrange the values in ascending order:

-2, 3, 5, 6, 7, 7, 7, 8, 8, 9

The number of values here is n = 10, which is even. Therefore, we take the value at index n/2 = 10/2 = 5, which is 7, and the value at index (n/2) + 1 = 10/2 + 1 = 6, which is also 7. The average of these two values is (7 + 7)/2 = 7.

Median = 7

Mode:

We can see that the number 7 appears 3 times in the set, 8 appears twice, and the rest of the values appear once. Therefore, the most frequent value is 7.

Mode = 7

**Example 2: Given a set of ungrouped values {1, 4, 9, 9, 6, 30, 21, 6, 1}, calculate the mean, median, and mode of this set.**

**Solution:**

Mean:

Sum = 1 + 4 + 9 + 9 + 6 + 30 + 21 + 6 +1 = 87

Mean = 87/9 = 9.7

Median:

The values in ascending order: 1, 1, 4, 6, 6, 9, 9, 21, 30

The number of values here is n = 9, which is odd. Therefore, we take the value at index (n + 1)/2 = 10/2 = 5, which is 6.

Median = 6

Mode:

The numbers 1, 6, and 9 each appear twice in the set, while the rest of the values appear only once. Therefore, the set has multiple modes. It is trimodal, meaning that it has three modes.

Mode = 1, 6, 9
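The two ungrouped examples can be checked with Python's standard `statistics` module. This is a verification sketch only. `statistics.multimode` requires Python 3.8 or later and returns every value that shares the highest frequency.

```python
import statistics

example1 = [7, 8, 3, 6, 7, 8, 9, 7, 5, -2]
example2 = [1, 4, 9, 9, 6, 30, 21, 6, 1]

for data in (example1, example2):
    print(
        round(statistics.mean(data), 1),
        statistics.median(data),
        sorted(statistics.multimode(data)),
    )
# 5.8 7.0 [7]
# 9.7 6 [1, 6, 9]
```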

**Example 3: Given a set of grouped data with a continuous frequency distribution:**

| Interval (class) | Frequency |
|---|---|
| 2-4 | 3 |
| 4-6 | 4 |
| 6-8 | 2 |

**Calculate the range, mean, and mean deviation.**

**Solution:**

Range:

The lowest value in the lowest interval = 2, and the highest value in the highest interval = 8

Range = 8 – 2 = 6

Mean:

Center values for each interval (respectively): 3, 5, 7

Sum of each center value multiplied by its frequency = 3*3 + 5*4 + 7*2 = 43

Mean = 43/(3 + 4 + 2) = 4.8

Mean Deviation:

Absolute difference between each mid-point and the mean (respectively): |3 – 4.8| = 1.8, |5 – 4.8| = 0.2, |7 – 4.8| = 2.2

Sum of differences multiplied by the frequencies: 1.8*3 + 0.2*4 + 2.2*2 = 10.6

Mean Deviation = 10.6 / 9 = 1.2
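The figures in this example can be reproduced with a short Python sketch (variable names are illustrative). Because the code keeps the mean at full precision instead of rounding it to 4.8, it gives a mean deviation of about 1.19, which agrees with the hand-calculated 1.2 after rounding to one decimal place.

```python
intervals = [(2, 4), (4, 6), (6, 8)]
frequencies = [3, 4, 2]

mids = [(lo + hi) / 2 for lo, hi in intervals]  # 3.0, 5.0, 7.0
total = sum(frequencies)                        # 9

value_range = max(hi for _, hi in intervals) - min(lo for lo, _ in intervals)
mean = sum(f * m for m, f in zip(mids, frequencies)) / total
mean_dev = sum(f * abs(m - mean) for m, f in zip(mids, frequencies)) / total

print(value_range, round(mean, 2), round(mean_dev, 2))  # 6 4.78 1.19
```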

**Example 4: Given a set of grouped data with a discrete frequency distribution:**

| Value (class) | Frequency |
|---|---|
| 1 | 3 |
| 5 | 4 |
| 7 | 2 |

**Calculate the range, mean, and mean deviation.**

**Solution:**

Range:

The lowest value = 1, and the highest value = 7

Range = 7 – 1 = 6

Mean:

Sum of each discrete value multiplied by its frequency = 1*3 + 5*4 + 7*2 = 37

Mean = 37/(3 + 4 + 2) = 4.1

Mean Deviation:

Absolute difference between each value and the mean (respectively): |1 – 4.1| = 3.1, |5 – 4.1| = 0.9, |7 – 4.1| = 2.9

Sum of differences multiplied by the frequencies: 3.1*3 + 0.9*4 + 2.9*2 = 18.7

Mean Deviation = 18.7 / 9 ≈ 2.1
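One way to double-check a discrete frequency table is to expand it back into the raw list of values and apply the ungrouped definitions. The sketch below does this with the standard `statistics` module (`fmean` requires Python 3.8 or later). The variable names are assumptions for illustration.

```python
import statistics

table = {1: 3, 5: 4, 7: 2}  # value -> frequency

# Expand the table into the raw list [1, 1, 1, 5, 5, 5, 5, 7, 7].
raw = [value for value, freq in table.items() for _ in range(freq)]

value_range = max(raw) - min(raw)
mean = statistics.fmean(raw)
# Mean deviation: the average absolute distance from the mean.
mean_dev = statistics.fmean(abs(x - mean) for x in raw)

print(value_range, round(mean, 2), round(mean_dev, 2))
```

Expanding the table gives the same result as the frequency-weighted formulas, because each value simply appears in the sum as many times as its frequency.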
