# What are some of the important formulae used in statistics?

Statistics is the branch of science concerned with collecting, analysing, and summarising data. It describes data in mathematical form, which makes a data set easier to understand and to apply in practice. Mathematical statistics draws on techniques such as linear algebra, differential equations, mathematical analysis, and probability theory.

There are two methods of analyzing data in mathematical statistics that are used on a large scale:

- Descriptive Statistics
- Inferential Statistics

**Some of the important formulae used in statistics**

**Mean**

Also known as the arithmetic mean, the mean is the average of a given set of numbers: the sum of all the data values divided by the total number of data values in the set.

**Mean Formula**

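The mean of a data set grouped into classes is given by the following formula:

$$\bar{x} = \frac{\sum f x}{\sum f}$$

where,

- x̄ = the mean value of the given data set
- f = frequency of each class
- x = mid-interval value of each class

For ungrouped data every frequency is 1, so the formula reduces to $\bar{x} = \frac{\sum x}{n}$, where n is the number of observations.

Hence, the average of all the data points is termed the mean.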
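As a minimal sketch, both forms of the mean can be written in a few lines of Python. The function names `mean` and `grouped_mean` are illustrative, and the grouped data in the usage lines is hypothetical.

```python
def mean(values):
    """Arithmetic mean of ungrouped data: sum of values / number of values."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum(f * x) / sum(f), where x is each class mid-value."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    total = sum(frequencies)
    if total == 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for f, x in zip(frequencies, midpoints)) / total


# Ungrouped data: ten class test marks
marks = [99, 95, 87, 55, 72, 86, 92, 89, 75, 88]
print(mean(marks))  # 83.8

# Hypothetical grouped data: classes 0-10, 10-20, 20-30 with mid-values 5, 15, 25
print(grouped_mean([5, 15, 25], [2, 3, 5]))  # 18.0
```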

**Median**

The median of a set of numbers is the middle-most observation once the data are arranged in ascending order. It is a measure of the central tendency of the data and is therefore useful for data analysis. Also known as the place average, the median is an easy metric to calculate: it is the value in the middle of the ordered data sequence.

**Median Formula**

To find the median of a data set, first arrange the numbers in ascending order. The middle value is then found as follows.

**Odd number of observations**

If the total number of observations in the data set is odd, the median is:

$$\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ term}$$

where n is the number of observations.

**Even number of observations**

If the total number of observations in the data set is even, the median is the average of the two middle terms:

$$\text{Median} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ term} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ term}}{2}$$

where n is the number of observations.
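The odd and even cases can be combined in one small function. This is a sketch rather than a reference implementation; note that Python lists are 0-indexed, so the ((n + 1)/2)-th term of the sorted data sits at index `n // 2`.

```python
def median(values):
    """Median of a list of numbers, handling odd and even lengths."""
    ordered = sorted(values)  # arrange in ascending order first
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one value")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th term, which is index n // 2 when counting from 0
        return ordered[mid]
    # Even n: average of the (n / 2)-th and (n / 2 + 1)-th terms
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 45, 15, 18, 11, 85, 19, 22, 7, 5, 13]))  # 15
print(median([4, 1, 3, 2]))  # 2.5
```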

**Mode**

In statistical data analysis, the mode of a data set is the value that occurs most often, that is, the value with the highest frequency.

For instance, in the set of numbers 8, 9, 10, 10, 5, 10, the mode is 10, since it occurs three times, more often than any other value.

**Mode formula for ungrouped data**

Finding the mode of ungrouped data starts by arranging the data values in either ascending or descending order. The repeated values are then identified along with their frequencies. The observation with the highest frequency is the modal value of the data.
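The counting procedure above can be sketched with `collections.Counter` from the Python standard library. The helper name `modes` is illustrative. It returns every value that shares the highest frequency, since a data set may have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return (sorted list of modal values, their frequency)."""
    if not values:
        raise ValueError("mode requires at least one value")
    counts = Counter(values)        # value -> frequency
    highest = max(counts.values())  # the highest frequency observed
    modal_values = sorted(v for v, c in counts.items() if c == highest)
    return modal_values, highest


print(modes([8, 9, 10, 10, 5, 10]))  # ([10], 3)
print(modes([1, 1, 2, 2, 3]))        # ([1, 2], 2)
```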

**Mode formula for grouped data**

$$\text{Mode} = l_0 + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h$$

In this formula, we have,

- $l_0$ is the lower limit of the modal class
- $h$ is the size of the class interval
- $f_1$ is the frequency of the modal class
- $f_0$ is the frequency of the class preceding the modal class
- $f_2$ is the frequency of the class succeeding the modal class

**Standard deviation**

Standard deviation is a measure of the dispersion of the values in a data set. It measures how far the values are scattered around their mean.

It is used in descriptive statistics as an indicator of how much the data points vary from their mean. The standard deviation is computed as the square root of the variance.

**Standard Deviation Formula**

**Population standard deviation**

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

In this formula, we have,

- σ = Population standard deviation
- N = Number of observations in the population
- $X_i$ = i-th observation in the population
- μ = Population mean

**Sample standard deviation**

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

In this formula, we have,

- s = Sample standard deviation
- n = Number of observations in the sample
- $x_i$ = i-th observation in the sample
- x̄ = Sample mean
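The population and sample standard deviations differ only in the divisor: N for a population and n - 1 for a sample. A from-scratch sketch makes the difference visible. The function names and the data set are illustrative.

```python
import math


def population_std(values):
    """Population standard deviation: divide the squared deviations by N."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one value is required")
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std(values):
    """Sample standard deviation: divide the squared deviations by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data with mean 5
print(population_std(data))        # 2.0
print(round(sample_std(data), 3))  # 2.138
```

For the same data the sample value is always the larger of the two, because it divides the same sum by a smaller number.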

**Variance**

The variance of a data distribution is a measure of how far the data points differ from the mean. It indicates how far a set of numbers is spread out from its average value. The variance is the square of the standard deviation.

It is the average of the squared deviations of the values from the mean. The larger the variance, the more the data values are spread out from the mean; the smaller the variance, the less they are spread out. Therefore, it measures the scatter of the data around the mean of the data set.

**Variance Formula**

**Population variance**

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

In this formula, we have,

- σ² = Population variance (σ is the population standard deviation)
- N = Number of observations in the population
- $X_i$ = i-th observation in the population
- μ = Population mean

**Sample variance**

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

In this formula, we have,

- s² = Sample variance (s is the sample standard deviation)
- n = Number of observations in the sample
- $x_i$ = i-th observation in the sample
- x̄ = Sample mean
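Python's standard `statistics` module provides both variants of the variance, which gives a quick way to see that the variance is the square of the corresponding standard deviation. The data set below is hypothetical.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]  # hypothetical data with mean 3

pop_var = statistics.pvariance(data)   # divides by N
samp_var = statistics.variance(data)   # divides by n - 1

print(f"{pop_var:.3f}")   # 2.000
print(f"{samp_var:.3f}")  # 2.500

# Variance is the square of the matching standard deviation
print(math.isclose(statistics.pstdev(data) ** 2, pop_var))  # True
print(math.isclose(statistics.stdev(data) ** 2, samp_var))  # True
```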

### Sample Questions

**Question 1. Find the mean of the following class test marks (out of 100) of 10 students:**

**99, 95, 87, 55, 72, 86, 92, 89, 75, 88**

**Solution:**

To find the mean of the data, first find the sum of all the students' marks.

Sum of observations = 99 + 95 + 87 + 55 + 72 + 86 + 92 + 89 + 75 + 88 = 838

Number of observations = 10

Therefore,

$$\text{Mean} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{838}{10} = 83.8$$

Therefore, the mean of the marks of the 10 students is 83.8.

**Question 2. Find the median of the following data**

**2, 45, 15, 18, 11, 85, 19, 22, 7, 5, 13**

**Solution:**

First, arrange the data in ascending order:

2, 5, 7, 11, 13, 15, 18, 19, 22, 45, 85

The number of observations is 11, which is odd, so apply the median formula for an odd number of observations with n = 11.

$$\text{Median} = \left(\frac{11+1}{2}\right)^{\text{th}} \text{ term} = 6^{\text{th}} \text{ term}$$

The 6th term of the ordered data is 15.

Therefore, the median of the data is 15.

**Question 3. Find the mode of the marks (out of 50) obtained by 40 students in a class test.**

| Marks obtained | Number of students |
|---|---|
| 10-20 | 4 |
| 20-30 | 8 |
| 30-40 | 16 |
| 40-50 | 12 |

**Solution:**

To find the mode, use the mode formula for grouped data.

Here we have,

- $f_1$ = the maximum class frequency = 16
- Modal class (the class interval of $f_1$) = 30-40
- $l_0$ = lower limit of the modal class = 30
- $h$ = size of the class interval = 10
- $f_0$ = frequency of the preceding class = 8
- $f_2$ = frequency of the succeeding class = 12

Now put all these values in the mode formula for grouped data:

$$\text{Mode} = l_0 + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h = 30 + \left(\frac{16 - 8}{2(16) - 8 - 12}\right) \times 10 = 30 + \frac{8}{12} \times 10$$

≈ 30 + 6.67

≈ 36.67

Therefore,

Mode ≈ 36.67

**Question 4. Assume there are 40 students in a class. Five students were selected at random and their heights were measured as 167, 162, 160, 159, 169. Calculate the standard deviation of their heights.**

**Solution:**

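Here, the 5 students are a sample drawn from the class of 40, so the sample standard deviation formula applies.

n = 5

$$\text{Mean} = \bar{x} = \frac{167 + 162 + 160 + 159 + 169}{5} = \frac{817}{5} = 163.4$$

$$\text{Standard Deviation (S.D.)} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{(3.6)^2 + (-1.4)^2 + (-3.4)^2 + (-4.4)^2 + (5.6)^2}{5 - 1}} = \sqrt{\frac{77.2}{4}} = \sqrt{19.3}$$

Standard Deviation ≈ 4.393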

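**Question 5. In the above example, find the variance of the heights of the 5 selected students.**

**Solution:**

As we know,

$$\text{Variance} = (\text{S.D.})^2$$

Standard Deviation = 4.393

Variance = $4.393^2$

Variance ≈ 19.298

Computed directly from the squared deviations, the sample variance is 19.3; the small difference comes from rounding the standard deviation before squaring it.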
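The results of Questions 4 and 5 can be checked with Python's standard `statistics` module, whose `stdev` and `variance` functions use the sample (n - 1) formulas. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

heights = [167, 162, 160, 159, 169]

mean_height = statistics.mean(heights)
sample_sd = statistics.stdev(heights)      # sample standard deviation (n - 1)
sample_var = statistics.variance(heights)  # sample variance (n - 1)

print(mean_height)           # 163.4
print(round(sample_sd, 3))   # 4.393
print(round(sample_var, 3))  # 19.3

# Squaring the rounded standard deviation reproduces the hand-computed figure
print(round(4.393 ** 2, 3))  # 19.298
```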