Statistics is a branch of science concerned with collecting, evaluating, and summarising data. It expresses the data in a mathematical form so that the data can be understood and applied to various problems. A data set is described through facts and figures derived from its numbers. Mathematical statistics applies mathematical techniques such as linear algebra, differential equations, mathematical analysis, and probability theory.
There are two methods of analyzing data in mathematical statistics that are used on a large scale:
- Descriptive Statistics
- Inferential Statistics
Some of the important formulae used in statistics
Mean
Also known as the arithmetic mean, it is calculated by computing the average of a given set of numbers. It is the summation of all the given data values divided by the total number of data values given in the set. It is calculated in the following way:
Mean Formula
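For grouped data, the mean is specified by the following formula,
$$\bar{x} = \frac{\sum f x}{\sum f}$$
For ungrouped data each value counts once (f = 1), so this reduces to $\bar{x} = \frac{\sum x}{n}$, where n is the number of values. In the grouped formula,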
- x̄ = the mean value of the set of given data.
- f = frequency of each class
- x = mid-interval value of each class
Hence, the average of all the data points is termed the mean.
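As an illustration, the two forms of the mean can be sketched in Python. The function names `mean` and `grouped_mean` and the sample numbers are assumptions chosen for this example, not part of the original text.

```python
def mean(values):
    """Arithmetic mean of ungrouped data: sum of values / number of values."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum(f * x) / sum(f), x = class mid-interval value."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("total frequency must be positive")
    weighted_sum = sum(f * x for f, x in zip(frequencies, midpoints))
    return weighted_sum / total_frequency


# Ungrouped data (illustrative values).
print(mean([2, 4, 6, 8]))  # 5.0

# Grouped data: classes 0-10, 10-20, 20-30 with mid-interval values 5, 15, 25.
print(grouped_mean([5, 15, 25], [2, 3, 5]))  # 18.0
```

With every frequency set to 1, `grouped_mean` gives the same result as `mean`, which matches the ungrouped definition.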
Median
The median of a given set of numbers is the middle-most observation once the data has been arranged in ascending order. It is a measure of the central tendency of the data and is therefore useful for data analysis. Also known as the place average, the median is an easy metric to calculate: it is the value placed in the middle of the ordered data sequence.
Median Formula
In order to find the median of the data set, the numbers are first arranged in ascending order. The middle value is then calculated from the following.
Odd number of observations
If the total number of observations in the data set is odd, the median formula is as follows:
$$\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ term}$$
where n is the number of observations
Even number of observations
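If the total number of observations in the data set is even, the median formula is as follows:
$$\text{Median} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ term} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ term}}{2}$$
where n is the number of observations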
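The odd and even cases can be sketched in a few lines of Python. The function name `median` and the sample values are assumptions for illustration. Note that the formulas count terms from 1, while Python lists are indexed from 0.

```python
def median(values):
    """Median: middle term for odd n, mean of the two middle terms for even n."""
    ordered = sorted(values)  # arrange the data in ascending order first
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one value")
    mid = n // 2
    if n % 2 == 1:
        # The ((n + 1) / 2)th term, counted from 1, sits at index n // 2.
        return ordered[mid]
    # The (n / 2)th and (n / 2 + 1)th terms sit at indices mid - 1 and mid.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))     # 5
print(median([4, 1, 3, 2]))  # 2.5
```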
Mode
In statistical data analysis, the mode of a given data set is the value that occurs most frequently, that is, the value with the highest frequency in the set. A data set can have more than one mode when several values share the highest frequency.
For instance, in the set of numbers 8, 9, 10, 10, 5, 10, the mode is 10, since it occurs the maximum number of times, that is, three times.
Mode formula for ungrouped data
Finding the mode of ungrouped data starts by arranging the data values in either ascending or descending order. The repeated values are then recorded along with their frequencies. The observation with the highest frequency is the modal value for the given data.
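The counting procedure above can be sketched with `collections.Counter` from the Python standard library. The helper name `modes` is an assumption for this example. It returns every value that shares the highest frequency, together with that frequency, and is shown on the set 8, 9, 10, 10, 5, 10 used earlier.

```python
from collections import Counter


def modes(values):
    """Return (sorted list of modal values, their frequency) for ungrouped data."""
    counts = Counter(values)  # maps each distinct value to its frequency
    if not counts:
        raise ValueError("mode requires at least one value")
    highest = max(counts.values())
    modal_values = sorted(v for v, c in counts.items() if c == highest)
    return modal_values, highest


print(modes([8, 9, 10, 10, 5, 10]))  # ([10], 3)
```

Counting with a dictionary does not strictly need the data to be sorted first. Sorting mainly helps when the frequencies are tallied by hand.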
Mode formula for grouped data
$$\text{Mode} = l_0 + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h$$
In this formula, we have,
- l0 is the lower limit of the modal class
- h is the size of the class interval
- f1 is the frequency of the modal class
- f0 is the frequency of the class preceding the modal class
- f2 is the frequency of the class succeeding the modal class
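A minimal sketch of the grouped-data computation, using the standard formula Mode = l0 + ((f1 - f0) / (2·f1 - f0 - f2)) × h with the symbols listed above. The function name `grouped_mode` and the sample class values are assumptions made for illustration.

```python
def grouped_mode(lower_limit, class_size, f1, f0, f2):
    """Mode of grouped data.

    lower_limit: lower limit of the modal class (l0)
    class_size:  size of the class interval (h)
    f1: frequency of the modal class
    f0: frequency of the class preceding the modal class
    f2: frequency of the class succeeding the modal class
    """
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 is zero")
    return lower_limit + (f1 - f0) / denominator * class_size


# Illustrative modal class 20-30 (h = 10) with f1 = 10, f0 = 6, f2 = 6.
print(grouped_mode(20, 10, f1=10, f0=6, f2=6))  # 25.0
```

When the neighbouring classes have equal frequencies, as here, the estimate falls at the centre of the modal class.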
Standard deviation
Standard deviation is a measure of the degree of dispersion of the values forming the data set, that is, how far the data points are scattered from their mean.
It is used in descriptive statistics and is computed as the square root of the variance.
Standard Deviation Formula
Population standard deviation
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - \mu\right)^2}$$
In this formula, we have,
- σ = Population standard deviation
- N = Number of observations in population
- Xi = ith observation in the population
- μ = Population mean
Sample standard deviation
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$
In this formula, we have,
- s = Sample standard deviation
- n = Number of observations in sample
- xi = ith observation in the sample
- x̄ = Sample mean
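The difference between the two definitions is the divisor: N for a population and n - 1 for a sample. The sketch below writes both out by hand. The function names and the data set are assumptions for illustration, and Python's standard `statistics` module offers `pstdev` and `stdev` for the same two quantities.

```python
import math


def population_std(values):
    """Population standard deviation: divide the squared deviations by N."""
    n = len(values)
    if n == 0:
        raise ValueError("population_std requires at least one value")
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std(values):
    """Sample standard deviation: divide the squared deviations by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample_std requires at least two values")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values with mean 5
print(population_std(data))        # 2.0
print(round(sample_std(data), 3))  # 2.138
```

The sample value is always slightly larger than the population value for the same numbers, because the divisor is smaller.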
Variance
The variance of a data distribution is a measure of how far the data points are spread out from their mean. It is the average of the squared deviations from the mean, and it equals the square of the standard deviation.
If the variance is larger, the data values are more spread out from the mean; if it is smaller, they are less spread out. It therefore measures the scatter of the data about the mean of the data set.
Variance Formula
Population variance
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \mu\right)^2$$
In this formula, we have,
- σ² = Population variance (σ is the population standard deviation)
- N = Number of observations in population
- Xi = ith observation in the population
- μ = Population mean
Sample variance
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$
In this formula, we have,
- s² = Sample variance (s is the sample standard deviation)
- n = Number of observations in sample
- xi = ith observation in the sample
- x̄ = Sample mean
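A short sketch of both variance definitions, again with assumed function names and illustrative data. The last lines check that the variance is the square of the corresponding standard deviation, using `pstdev` and `stdev` from Python's standard `statistics` module.

```python
import math
import statistics


def population_variance(values):
    """Population variance: mean of the squared deviations from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("population_variance requires at least one value")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sample variance: squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample_variance requires at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values with mean 5
print(population_variance(data))        # 4.0
print(round(sample_variance(data), 3))  # 4.571

# Variance is the square of the standard deviation, not double it.
print(math.isclose(population_variance(data), statistics.pstdev(data) ** 2))  # True
print(math.isclose(sample_variance(data), statistics.stdev(data) ** 2))       # True
```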
Sample Questions
Question 1. Find the mean of the class test marks (out of 100) of 10 students.
99, 95, 87, 55, 72, 86, 92, 89, 75, 88
Solution:
To find the mean of the data, first find the sum of all the students' marks.
Sum of observations = 99+95+87+55+72+86+92+89+75+88 = 838
Number of observations = 10
Therefore,
$$\bar{x} = \frac{\text{Sum of observations}}{\text{Number of observations}} = \frac{838}{10} = 83.8$$
The mean of the marks of the 10 students is 83.8.
Question 2. Find the median of the following data
2, 45, 15, 18, 11, 85, 19, 22, 7, 5, 13
Solution:
First arrange this data in ascending order
2, 5, 7, 11, 13, 15, 18, 19, 22, 45, 85
Here the number of observations is 11, which is odd.
So apply the median formula for an odd number of observations, with n = 11:
$$\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ term} = \left(\frac{11+1}{2}\right)^{\text{th}} \text{ term} = 6^{\text{th}} \text{ term}$$
The 6th term of the ordered data is 15.
Therefore,
The median of the data is 15.
Question 3. Find the mode of the marks obtained by 40 students in a class test out of 50.
Marks obtained | Number of students |
10-20 | 4 |
20-30 | 8 |
30-40 | 16 |
40-50 | 12 |
Solution:
To find the mode use the formula of mode for grouped data
Here we have,
f1 = The maximum class frequency = 16
The class interval of f1 = 30-40
l0 = Lower limit of the maximum frequency ( modal class ) = 30
h = Size of the class interval = 10
f0 = Frequency of the preceding class = 8
f2 = Frequency of the succeeding class = 12
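Now put all these values in the mode formula for grouped data
$$\text{Mode} = l_0 + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h = 30 + \left(\frac{16 - 8}{2(16) - 8 - 12}\right) \times 10$$
= 30 + (8/12) × 10
= 30 + 6.67
= 36.67
Therefore,
Mode ≈ 36.67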
Question 4. Assume there are 40 students in a class. Five students were selected at random and their heights were measured as 167, 162, 160, 159, 169. Calculate the standard deviation of their heights.
Solution:
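Here the 5 students are a sample drawn from the class, so the sample standard deviation formula applies.
n = 5
Mean = (167 + 162 + 160 + 159 + 169) / 5 = 817 / 5
= 163.4
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{(3.6)^2 + (-1.4)^2 + (-3.4)^2 + (-4.4)^2 + (5.6)^2}{5-1}} = \sqrt{\frac{77.2}{4}} = \sqrt{19.3}$$
Standard Deviation = 4.393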
Question 5. In the above example, find the variance of the heights of the 5 selected students.
Solution:
As we know that,
Variance = (S.D)²
Standard Deviation = 4.393
Variance = 4.393²
Variance ≈ 19.298 (the unrounded value, the sum of squared deviations 77.2 divided by n − 1 = 4, is 19.3)
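As a cross-check, the answers to Questions 1, 2, 4 and 5 can be reproduced with Python's standard `statistics` module. This is only a verification sketch: the variable names are chosen for this example, and `stdev` and `variance` are the sample (n - 1) versions, which is the assumption that matches the 4.393 obtained in Question 4.

```python
import statistics

marks = [99, 95, 87, 55, 72, 86, 92, 89, 75, 88]        # Question 1
values = [2, 45, 15, 18, 11, 85, 19, 22, 7, 5, 13]      # Question 2
heights = [167, 162, 160, 159, 169]                     # Questions 4 and 5

mean_marks = statistics.mean(marks)
median_values = statistics.median(values)
sd_heights = statistics.stdev(heights)        # sample standard deviation
var_heights = statistics.variance(heights)    # sample variance

print(mean_marks)              # 83.8
print(median_values)           # 15
print(round(sd_heights, 3))    # 4.393
print(round(var_heights, 3))   # 19.3
```

The variance prints as 19.3 rather than 19.298 because it is computed from the unrounded deviations. Squaring the already rounded standard deviation 4.393 gives 19.298.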