**Prerequisites –** Hypothesis Testing, p-value

A t-test is a type of inferential statistic used to determine if there is a significant difference between the means of two groups, which may be related in certain features.

If the t-value is large (in absolute terms) => the two groups are likely to come from different populations.

If the t-value is small => the two groups are likely to come from the same population.

**Terminologies involved**

**Degree of freedom (df) –** It tells us the number of independent values that are free to vary when calculating an estimate from the sample groups.

$$df = \sum_{S} (n_S - 1) \qquad \text{[Eq-2]}$$

where,

- $df$ = degree of freedom
- $n_S$ = size of the sample $S$

Suppose we have 2 samples A and B. The df would be calculated as

$$df = (n_A - 1) + (n_B - 1)$$
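As a quick check of Eq-2, the calculation can be sketched in a few lines of Python. The helper name `degrees_of_freedom` and the sample sizes of 10 are assumptions made for illustration only.

```python
def degrees_of_freedom(*sample_sizes):
    """Eq-2 style: add up (n - 1) over every sample size passed in."""
    if any(n < 1 for n in sample_sizes):
        raise ValueError("each sample needs at least one observation")
    return sum(n - 1 for n in sample_sizes)


# Two independent samples A and B with 10 observations each (illustrative sizes).
df = degrees_of_freedom(10, 10)
print(df)  # 18
```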

**Significance level (α) –** It is the probability of rejecting the null hypothesis when it is true. In simpler terms, it is the risk we accept of saying that a difference exists between two groups when in reality it does not.

There are three types of t-tests, and they are categorized as dependent and independent t-tests.

- **Independent samples t-test:** compares the means for two groups.
- **Paired sample t-test:** compares means from the same group at different times (say, one year apart).
- **One sample t-test:** tests the mean of a single group against a known mean.

**1. Independent sample t-test**

The independent sample t-test, also known as the unpaired sample t-test, is used to find out whether the difference found between two groups is actually significant or just a random occurrence.

**We can use this when:**

- the population mean or standard deviation is unknown (information about the population is unknown).
- the two samples are separate/independent, e.g. boys and girls (the two are independent of each other).

**Formula used:**

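$$t = \frac{\mu_A - \mu_B}{\sqrt{\left[\dfrac{\left(\sum A^2 - \frac{(\sum A)^2}{n_A}\right) + \left(\sum B^2 - \frac{(\sum B)^2}{n_B}\right)}{n_A + n_B - 2}\right]\left[\dfrac{1}{n_A} + \dfrac{1}{n_B}\right]}} \qquad \text{[Eq-3]}$$

where,

- $t$ = t-value
- $A$ = sample A
- $B$ = sample B
- $\mu_A$ = mean of sample A
- $\mu_B$ = mean of sample B
- $n_A$ = sample size of A
- $n_B$ = sample size of B
- $df$ = degree of freedom (here $n_A + n_B - 2$)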

**Steps involved**

Step 1 - Find the sum of all values in each sample.

Step 2 - Square the sum values found in Step 1.

Step 3 - Find the sum of the squares of the individual values in each sample.

Step 4 - Calculate the mean of each sample.

Step 5 - Find the degree of freedom (df) using Eq-2.

Step 6 - Insert all the values found in Steps 1-4 into Eq-3 and find the calculated t-value.

Step 7 - Use the values of df and α (take α = 0.05 if not given) in the two-tailed t-table to find the table value of t.

Step 8 - Compare the values of t found in Step 6 and Step 7.
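The steps above can be sketched in plain Python. This is a minimal sketch, assuming the pooled-variance form of the t statistic (sum of squared deviations of both samples divided by df, multiplied by 1/n_A + 1/n_B); the function name `independent_t` and the two short input lists are hypothetical and only serve as illustration.

```python
import math


def independent_t(sample_a, sample_b):
    """Return (t, df) for two independent samples using a pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)

    # Step 1: sum of the values in each sample.
    sum_a, sum_b = sum(sample_a), sum(sample_b)
    # Step 2: square of each sum.
    sq_sum_a, sq_sum_b = sum_a ** 2, sum_b ** 2
    # Step 3: sum of the squared individual values.
    sum_sq_a = sum(x * x for x in sample_a)
    sum_sq_b = sum(x * x for x in sample_b)
    # Step 4: mean of each sample.
    mean_a, mean_b = sum_a / n_a, sum_b / n_b
    # Step 5: degrees of freedom.
    df = (n_a - 1) + (n_b - 1)

    # Step 6: pooled variance times (1/n_A + 1/n_B), then the t-value.
    pooled_var = ((sum_sq_a - sq_sum_a / n_a) + (sum_sq_b - sq_sum_b / n_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Illustrative data, not taken from the article.
t_cal, df = independent_t([2, 4, 6], [1, 2, 3])
print(round(t_cal, 3), df)
```

Steps 7 and 8 (the table lookup and the comparison) are left out here because they depend on the t-table being used.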

**Interpreting the results**

If $|t_{cal}| > t_{table}$ => $p < \alpha$ (α = 0.05) => significant difference between the two groups found.

If $|t_{cal}| < t_{table}$ => $p > \alpha$ (α = 0.05) => no significant difference between the two groups.

**Example Problem (Step by Step)**

Suppose two independent samples A and B are given, with the following values. We have to perform the independent samples t-test on this data.

Sample A | Sample B |
---|---|
1 | 1 |
2 | 2 |
4 | 2 |
4 | 3 |
5 | 3 |
5 | 4 |
6 | 5 |
7 | 6 |
8 | 7 |
8 | 7 |

Step 1 -

$\sum A = 1 + 2 + 4 + 4 + 5 + 5 + 6 + 7 + 8 + 8 = 50$

$\sum B = 1 + 2 + 2 + 3 + 3 + 4 + 5 + 6 + 7 + 7 = 40$

Step 2 -

$(\sum A)^2 = (50)^2 = 2500$

$(\sum B)^2 = (40)^2 = 1600$

Step 3 -

$\sum A^2 = 1^2 + 2^2 + 4^2 + 4^2 + 5^2 + 5^2 + 6^2 + 7^2 + 8^2 + 8^2 = 300$

$\sum B^2 = 1^2 + 2^2 + 2^2 + 3^2 + 3^2 + 4^2 + 5^2 + 6^2 + 7^2 + 7^2 = 202$

Step 4 -

$n = 10$

$\mu_A = \sum A / n = 50/10 = 5$

$\mu_B = \sum B / n = 40/10 = 4$

Step 5 - $df = (n_A - 1) + (n_B - 1) = (10 - 1) + (10 - 1) = 18$ [using Eq-2]

Step 6 - Putting the values found into Eq-3 to find the calculated value of t, we get

$$t_{cal} = \frac{5 - 4}{\sqrt{\left[\dfrac{(300 - 250) + (202 - 160)}{18}\right]\left[\dfrac{1}{10} + \dfrac{1}{10}\right]}} = \frac{1}{\sqrt{1.022}} \approx 0.99$$

Step 7 - Let α = 0.05 and df = 18. Looking up the two-tailed t-table (see the table below), we get $t_{table} = 2.10$.

df / α | 0.20 | 0.10 | 0.05 | . . |
---|---|---|---|---|
∞ | 1.282 | 1.645 | 1.960 | . . |
1 | 3.078 | 6.314 | 12.706 | . . |
2 | 1.886 | 2.920 | 4.303 | . . |
: | : | : | : | . . |
8 | 1.397 | 1.860 | 2.306 | . . |
9 | 1.383 | 1.833 | 2.262 | . . |
: | : | : | : | . . |
18 | 1.330 | 1.734 | 2.101 | . . |
19 | 1.328 | 1.729 | 2.093 | . . |
20 | 1.325 | 1.725 | 2.086 | . . |
: | : | : | : | . . |

Step 8 - 0.99 < 2.10 ($t_{cal} < t_{table}$) => no significant difference found between the two groups.
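The lookup and comparison in Steps 7 and 8 can also be sketched in code. The dictionary below copies only the finite-df rows of the two-tailed t-table excerpt above; the names `T_TABLE_TWO_TAILED` and `is_significant` are hypothetical, and comparing the absolute value of t is the usual convention for a two-tailed test.

```python
# Two-tailed critical values copied from the t-table excerpt above,
# keyed as {df: {alpha: t_table}}. Only the rows shown there are included.
T_TABLE_TWO_TAILED = {
    1: {0.20: 3.078, 0.10: 6.314, 0.05: 12.706},
    2: {0.20: 1.886, 0.10: 2.920, 0.05: 4.303},
    8: {0.20: 1.397, 0.10: 1.860, 0.05: 2.306},
    9: {0.20: 1.383, 0.10: 1.833, 0.05: 2.262},
    18: {0.20: 1.330, 0.10: 1.734, 0.05: 2.101},
    19: {0.20: 1.328, 0.10: 1.729, 0.05: 2.093},
    20: {0.20: 1.325, 0.10: 1.725, 0.05: 2.086},
}


def is_significant(t_cal, df, alpha=0.05):
    """Two-tailed decision: compare |t_cal| with the tabulated value."""
    # Raises KeyError if df or alpha is not part of the excerpt.
    t_table = T_TABLE_TWO_TAILED[df][alpha]
    return abs(t_cal) > t_table


# Values from the worked example: t_cal = 0.99, df = 18, alpha = 0.05.
print(is_significant(0.99, 18))  # False
```

A full table, or a statistics library, would be needed for df values that are not in the excerpt.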

**2. Paired sample t-test**

The paired sample t-test, also known as the dependent sample t-test, is used to find out whether the mean of the differences between two paired samples is 0. The test is done on dependent samples, usually focusing on a particular group of people or things. Each entity is measured twice, resulting in a pair of observations.

**We can use this when:**

- Two paired (twin-like) samples are given, e.g. scores obtained by the same students in English and Math.
- The dependent variable (data) is continuous.
- The pairs of observations are independent of one another.
- The differences between the paired values are approximately normally distributed.

**Formula Used**

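$$t = \frac{\sum D}{\sqrt{\dfrac{N \sum D^2 - (\sum D)^2}{N - 1}}} \qquad \text{[Eq-4]}$$

where,

- $t$ = t-value
- $D$ = difference between the two samples (A − B)
- $N$ = sample size (same as n)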

**Steps Involved**

Step 1 - Find the difference D = A − B for each pair and sum the differences. [$\sum D = \sum (A - B)$]

Step 2 - Find the sum of the squares of each D found in Step 1. [$\sum D^2$]

Step 3 - Find the square of the sum of D. [$(\sum D)^2$]

Step 4 - Put the values found in Steps 1-3 into Eq-4 and find the t-value.

Step 5 - Find the degree of freedom (df).

NOTE: Here, df is calculated for the data as a whole, not for each individual sample set, because the two samples A and B are paired (twin-like). With N pairs,

$$df = N - 1$$

Step 6 - Use the values of df and α (take α = 0.05 if not given) in the two-tailed t-table to find the table value of t.

Step 7 - Compare the values of t found in Step 4 and Step 6.

**Interpretation of Results**

Same as that of Independent samples t-test.

**Example Problem (Step by Step)**

Consider the following example. Scores (out of 25) of the subjects Math and SST are taken for a sample of 10 students. We have to perform the paired sample t-test for this data.

Student no. | Math | SST | Step 1: D = Math − SST | Step 2: D² |
---|---|---|---|---|
1 | 4 | 15 | -11 | 121 |
2 | 4 | 16 | -12 | 144 |
3 | 7 | 14 | -7 | 49 |
4 | 16 | 14 | 2 | 4 |
5 | 20 | 22 | -2 | 4 |
6 | 11 | 22 | -11 | 121 |
7 | 13 | 23 | -10 | 100 |
8 | 9 | 18 | -9 | 81 |
9 | 11 | 18 | -7 | 49 |
10 | 15 | 19 | -4 | 16 |
Sum | | | ∑D = -71 | ∑D² = 689 |

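Step 1 and Step 2 - as shown in the table above.

Step 3 - $(\sum D)^2 = (-71)^2 = 5041$

Step 4 - Putting the values into Eq-4, we get

$$t_{cal} = \frac{-71}{\sqrt{\dfrac{10 \times 689 - 5041}{9}}} = \frac{-71}{14.33} \approx -4.95$$

Step 5 - $df = n - 1 = 10 - 1 = 9$

Step 6 - Using df = 9 and α = 0.05 in the two-tailed table, we get $t_{table} = 2.26$.

Step 7 - For a two-tailed test we compare the absolute value of t: $|t_{cal}| = 4.95 > 2.26$ ($|t_{cal}| > t_{table}$) => significant difference found between the two groups.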
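The paired calculation can be reproduced with a short Python sketch. It assumes the form t = ∑D / sqrt((N∑D² − (∑D)²) / (N − 1)), built from the three sums named in the steps; the function name `paired_t` is hypothetical, and the two lists are the Math and SST scores from the table above.

```python
import math


def paired_t(first, second):
    """Return (t, df) for paired samples from the sums of D and D squared."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    n = len(first)
    diffs = [a - b for a, b in zip(first, second)]  # D = A - B for each pair
    sum_d = sum(diffs)                              # Step 1: sum of D
    sum_d_sq = sum(d * d for d in diffs)            # Step 2: sum of D squared
    sq_sum_d = sum_d ** 2                           # Step 3: square of the sum of D
    t = sum_d / math.sqrt((n * sum_d_sq - sq_sum_d) / (n - 1))
    return t, n - 1


math_scores = [4, 4, 7, 16, 20, 11, 13, 9, 11, 15]
sst_scores = [15, 16, 14, 14, 22, 22, 23, 18, 18, 19]

t_cal, df = paired_t(math_scores, sst_scores)
print(round(t_cal, 2), df)  # -4.95 9
```

Because the two-tailed table lists positive values only, it is the absolute value of `t_cal` that should be compared with the table value (2.262 for df = 9 and α = 0.05).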

**3. One sample t-test**

The one sample t-test is one of the most widely used t-tests. It compares the sample mean of the data to a particular given value, typically the true/population mean.

**We can use this when:**

- the sample size is small (under 30).
- data is collected randomly.
- data is approximately normally distributed.

**Formula used:**

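$$t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \qquad \text{[Eq-5]}$$

where,

- $t$ = t-value
- $\bar{x}$ = sample mean
- $\mu$ = true/population mean
- $\sigma$ = standard deviation
- $n$ = sample size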

**Steps involved**

Step 1 - Define the null ($h_0$) and alternative ($h_1$) hypotheses.

Step 2 - Calculate the sample mean if it is not given. (The population mean, standard deviation and n are given.)

Step 3 - Put the values from Step 2 into Eq-5 and calculate the t-value ($t_{cal}$).

Step 4 - Calculate the degree of freedom (df), in the same way as for the paired sample t-test.

Step 5 - Take α = 0.05 if not given. Use the values of df and α to find $t_{table}$ from the one-tailed t-table.

Step 6 - Compare the values of t found in Step 3 and Step 5.

**Interpretation of Results**

Same as that of Independent samples t-test.

**Example Problem (Step by Step)**

Consider the following example. The weights of 25 obese people were taken before enrolling them in a nutrition camp. The population mean weight was found to be 45 kg before starting the camp. After finishing the camp, the sample mean for the same 25 people was found to be 75 kg with a standard deviation of 25 kg. Did the nutrition camp work?

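Step 1 -

$h_0$: μ = 45 (no difference from the true mean)

$h_1$: μ ≠ 45 (differs from the true mean)

Step 2 - Given, $\bar{x} = 75$, $\mu = 45$, $\sigma = 25$, $n = 25$

Step 3 - Putting the values from Step 2 into Eq-5, we get

$$t_{cal} = \frac{75 - 45}{25 / \sqrt{25}} = \frac{30}{5} = 6$$

Step 4 - $df = n - 1 = 24$

Step 5 - Using df = 24 and α = 0.05 in the one-tailed table, we get $t_{table} = 1.711$. (For the two-sided $h_1$ stated in Step 1, the two-tailed value at df = 24 is 2.064.)

Step 6 - 6 > 1.711 ($t_{cal} > t_{table}$, and 6 also exceeds 2.064) => the sample mean differs significantly from the population mean => the nutrition camp significantly changed the weights.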
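The one sample calculation is short enough to check in a few lines of Python. This sketch assumes t = (x̄ − μ) / (σ / √n) computed from summary statistics; the function name `one_sample_t` and its parameter names are hypothetical.

```python
import math


def one_sample_t(sample_mean, population_mean, std_dev, n):
    """Return (t, df) for a one sample t-test from summary statistics."""
    std_err = std_dev / math.sqrt(n)  # standard error of the mean
    return (sample_mean - population_mean) / std_err, n - 1


# Values from the example above.
t_cal, df = one_sample_t(sample_mean=75, population_mean=45, std_dev=25, n=25)
print(t_cal, df)  # 6.0 24
```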

The t-tests discussed above are widely used in medical research, where experts use them to draw inferences about the effects of various medicines and drugs on a population. However, it is the analyst's responsibility to choose the t-test that suits the data and to check that all the assumptions of that test are met.