T-Test Approach in R Programming

Last Updated : 08 Jun, 2023

Let us understand the t-test in R programming with the help of an example. Suppose a businessman with two sweet shops in a town wants to check whether the average number of sweets sold per day is the same in both stores.

So, the businessman records the number of sweets bought by 15 random customers in each shop. He finds that the first shop sold 30 sweets on average, whereas the second shop sold 40. From the owner’s point of view, the second shop appears to be doing better business than the first. But the data come from only a small random sample of customers, who cannot represent all the customers. This is where the t-test comes into play: it helps us judge whether the difference between the two means is real or simply due to chance.

Mathematically, the t-test takes a sample from each group and starts from the null hypothesis that the two population means are the same. It then measures how large the observed difference is relative to the variability within the samples.
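To make the idea concrete, here is a minimal sketch of the sweet-shop scenario above. The data are simulated, not real sales figures: the means of 30 and 40 come from the story, while the standard deviation of 8, the seed, and the names shopA and shopB are assumptions chosen only for illustration.

```r
# Hypothetical data: sweets bought by 15 random customers in each shop.
# Means (30 and 40) follow the story; sd = 8 is an assumed value.
set.seed(42)
shopA <- rnorm(15, mean = 30, sd = 8)
shopB <- rnorm(15, mean = 40, sd = 8)

# Null hypothesis: both shops have the same population mean.
result <- t.test(shopA, shopB)

# A small p-value suggests the observed difference is unlikely
# to be due to chance alone.
result$p.value
```

With only 15 customers per shop, the p-value depends heavily on how spread out the purchases are, which is why the sample means alone cannot settle the question.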

Classification of T-tests

One Sample T-test
Two sample T-test
Paired sample T-test

One Sample T-Test Approach

The One-Sample T-Test is used to test the statistical difference between a sample mean and a known or assumed/hypothesized value of the mean in the population.

To perform a one-sample t-test in R, we use the syntax t.test(y, mu = 0), where y is the name of the variable of interest and mu is set to the mean specified by the null hypothesis.

For Example:

R

set.seed(0)
sweetSold <- rnorm(50, mean = 140, sd = 5)
 
# mu = the hypothesized population mean under the null hypothesis.
t.test(sweetSold, mu = 150)

Output:

    One Sample t-test

data:  sweetSold
t = -15.249, df = 49, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 150
95 percent confidence interval:
 138.8176 141.4217
sample estimates:
mean of x 
 140.1197

t = -15.249, df = 49, p-value < 2.2e-16: this line reports the test statistic (t), the degrees of freedom (df), and the p-value. Here the computed t-value is -15.249 with 49 degrees of freedom, and the p-value is below 2.2e-16, which is strong evidence against the null hypothesis that the mean is 150.
alternative hypothesis: true mean is not equal to 150: this line states the alternative hypothesis, namely that the population mean differs from 150 (a two-sided test).
95 percent confidence interval: the interval from 138.8176 to 141.4217 is a 95% confidence interval for the population mean. The procedure that produces it covers the true mean in 95% of repeated samples. Since 150 lies outside the interval, this agrees with the small p-value.
sample estimates: mean of x 140.1197: this line gives the sample estimate, here the sample mean of 140.1197.
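The numbers in this output can be reproduced by hand, which shows what t.test() computes in the one-sample case. The sketch below reuses the same simulated data; the variable names mu0, tStat, dfree, pValue, and ci are illustrative choices, not part of R.

```r
set.seed(0)
sweetSold <- rnorm(50, mean = 140, sd = 5)
mu0 <- 150                                   # hypothesized population mean

n      <- length(sweetSold)
stdErr <- sd(sweetSold) / sqrt(n)            # standard error of the mean
tStat  <- (mean(sweetSold) - mu0) / stdErr   # test statistic
dfree  <- n - 1                              # degrees of freedom
pValue <- 2 * pt(-abs(tStat), df = dfree)    # two-sided p-value

# 95% confidence interval for the population mean
ci <- mean(sweetSold) + c(-1, 1) * qt(0.975, df = dfree) * stdErr

# Both comparisons should return TRUE
builtIn <- t.test(sweetSold, mu = mu0)
all.equal(tStat, unname(builtIn$statistic))
all.equal(ci, as.numeric(builtIn$conf.int))
```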

Two sample T-Test Approach

The two-sample t-test checks whether the difference between the means of two independent groups is real or simply due to chance.
The general form of the test is t.test(y1, y2, paired=FALSE). By default, R assumes that the variances of y1 and y2 are unequal and runs Welch’s test. To assume equal variances and run the classic pooled test instead, we set var.equal=TRUE.

For Example:

R

set.seed(0)
 
shopOne <- rnorm(50, mean = 140, sd = 4.5)
shopTwo <- rnorm(50, mean = 150, sd = 4)
 
t.test(shopOne, shopTwo, var.equal = TRUE)

Output:

    Two Sample t-test

data:  shopOne and shopTwo
t = -13.158, df = 98, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -11.482807  -8.473061
sample estimates:
mean of x mean of y 
 140.1077  150.0856

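The test statistic is t = -13.158 with 98 degrees of freedom, and the p-value is below 2.2e-16, so there is strong evidence that the two shops’ means differ. The 95 percent confidence interval for the difference in means (shopOne minus shopTwo) runs from -11.482807 to -8.473061 and does not contain 0. Under sample estimates, mean of x is the sample mean of shopOne (140.1077) and mean of y is the sample mean of shopTwo (150.0856).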
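The effect of var.equal can be seen by running both variants on the same data. The sketch below also recomputes the pooled t statistic by hand. The names welch, pooled, sp2, and tPooled are illustrative.

```r
set.seed(0)
shopOne <- rnorm(50, mean = 140, sd = 4.5)
shopTwo <- rnorm(50, mean = 150, sd = 4)

welch  <- t.test(shopOne, shopTwo)                    # default: var.equal = FALSE
pooled <- t.test(shopOne, shopTwo, var.equal = TRUE)  # classic pooled test

# The pooled test uses n1 + n2 - 2 degrees of freedom;
# Welch's test never uses more than that.
c(welch = unname(welch$parameter), pooled = unname(pooled$parameter))

# Pooled t statistic computed by hand
n1  <- length(shopOne)
n2  <- length(shopTwo)
sp2 <- ((n1 - 1) * var(shopOne) + (n2 - 1) * var(shopTwo)) / (n1 + n2 - 2)
tPooled <- (mean(shopOne) - mean(shopTwo)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Should return TRUE
all.equal(tPooled, unname(pooled$statistic))
```

When the two groups have similar variances and equal sizes, as here, the two variants give nearly the same answer; the choice matters more when variances or sample sizes differ noticeably.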

Paired Sample T-test

This is a statistical procedure that is used to determine whether the mean difference between two sets of observations is zero. In a paired sample t-test, each subject is measured two times, resulting in pairs of observations.

The test is run using the syntax t.test(y1, y2, paired=TRUE)

For Example:

R

set.seed(2820)
 
sweetOne <- rnorm(100, mean = 14, sd = 0.3)
sweetTwo <- rnorm(100, mean = 13, sd = 0.2)
 
t.test(sweetOne, sweetTwo, paired = TRUE)

Output:

    Paired t-test

data:  sweetOne and sweetTwo
t = 29.31, df = 99, p-value < 2.2e-16
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
 0.9892738 1.1329434
sample estimates:
mean difference 
       1.061109

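Under sample estimates, mean difference is the average of the pairwise differences (sweetOne minus sweetTwo), here 1.061109. The 95 percent confidence interval from 0.9892738 to 1.1329434 does not contain 0, and the p-value is below 2.2e-16, so there is strong evidence that the true mean difference is not zero.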
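A paired t-test is equivalent to a one-sample t-test on the pairwise differences with mu = 0. The sketch below checks this on the same simulated data; the names paired and oneSample are illustrative.

```r
set.seed(2820)
sweetOne <- rnorm(100, mean = 14, sd = 0.3)
sweetTwo <- rnorm(100, mean = 13, sd = 0.2)

paired    <- t.test(sweetOne, sweetTwo, paired = TRUE)
oneSample <- t.test(sweetOne - sweetTwo, mu = 0)

# Both comparisons should return TRUE
all.equal(unname(paired$statistic), unname(oneSample$statistic))
all.equal(paired$p.value, oneSample$p.value)
```

This is also why the order of the observations matters in a paired test: element i of the first vector must belong to the same subject as element i of the second.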

Differences between one-sample, two-sample, and paired-sample t-tests:

Aspect	One-sample t-test	Two-sample t-test	Paired sample t-test
Purpose	Determines whether a single sample’s mean differs significantly from a given population mean.	Determines whether there is a significant difference between the means of two independent groups.	Determines whether the means of two related or paired samples differ significantly from one another.
Data	A single set of measurements or observations.	Two distinct, independent groups or samples.	The same group or set of subjects measured under two different conditions or at two different times.
Hypotheses	Tests whether the population mean differs from a hypothesized value.	Tests whether the two groups’ population means differ.	Tests whether the mean of the paired differences differs from zero.
Assumptions	Observations are independent and the data are approximately normally distributed.	Observations are independent within and between groups, and the data in each group are approximately normally distributed; the variances may be equal (pooled test) or unequal (Welch’s test).	Observations come in dependent, matched pairs; the pairs are independent of one another, and the differences are approximately normally distributed.
Example	Checking whether a class’s average test score differs significantly from the national average test score.	Comparing the average heights of men and women to see whether the two groups differ significantly.	Comparing measurements taken from the same group of people before and after a new treatment to see whether it has a measurable effect.
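The assumptions listed in the table can be checked informally before running a t-test. The sketch below uses the Shapiro-Wilk test (shapiro.test) for normality and the F test (var.test) for equal variances, both from base R’s stats package. These are one common choice rather than the only one, and the data are the simulated shop values used earlier.

```r
set.seed(0)
shopOne <- rnorm(50, mean = 140, sd = 4.5)
shopTwo <- rnorm(50, mean = 150, sd = 4)

# Shapiro-Wilk test: the null hypothesis is that the data are normally distributed.
normOne <- shapiro.test(shopOne)$p.value
normTwo <- shapiro.test(shopTwo)$p.value

# F test: the null hypothesis is that the two variances are equal.
eqVar <- var.test(shopOne, shopTwo)$p.value

c(normOne = normOne, normTwo = normTwo, eqVar = eqVar)
```

A large p-value here does not prove that an assumption holds; it only means the data show no strong evidence against it.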