One-Sample T-test in R

Last Updated : 08 Nov, 2023

The one-sample t-test is a statistical method for determining whether a sample’s mean differs significantly from an assumed or known population mean. It is a basic tool of hypothesis testing, and the R Programming Language makes it straightforward to run. The test lets analysts and researchers draw inferences about a population mean from sample data. In this tutorial, we’ll cover the underlying mathematics and walk through worked examples of one-sample t-tests in R, with an explanation of each part of the output.

One-Sample T-test

The one-sample t-test is based on the t-distribution and is commonly used when dealing with small sample sizes or when the population standard deviation is unknown. The formula for the one-sample t-test statistic is:

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

Where:

x̄: Sample mean.
μ: Hypothesized population mean.
s: Sample standard deviation.
n: Number of observations in the sample.

The t-statistic measures how many standard errors the sample mean (x̄) lies from the hypothesized population mean (μ), where the standard error is s/√n. Its sign shows the direction of the difference. A larger absolute value places the sample mean further out in the tails of the t-distribution, meaning the observed difference is large relative to the sampling variability.
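As a minimal sketch of the formula, the statistic can be computed by hand and compared with what R’s built-in t.test() reports. The sample values and the hypothesized mean below are made up for illustration and do not come from the examples in this article.

```r
# Illustrative sample (hypothetical values, not from the article)
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3)
mu0 <- 5  # hypothesized population mean (assumed for this sketch)

n <- length(x)             # number of observations
x_bar <- mean(x)           # sample mean
s <- sd(x)                 # sample standard deviation
std_error <- s / sqrt(n)   # standard error of the mean

# t = (x_bar - mu) / (s / sqrt(n))
t_manual <- (x_bar - mu0) / std_error

# t.test() should report the same statistic
t_builtin <- unname(t.test(x, mu = mu0)$statistic)

print(c(manual = t_manual, builtin = t_builtin))
```

Both values should agree, which confirms that t.test() applies the formula given above.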

Null and Alternative Hypotheses

Before conducting a one-sample t-test, it’s essential to establish the null hypothesis (H0) and the alternative hypothesis (Ha). In the context of a one-sample t-test:

Null Hypothesis (H0): The true population mean is equal to the hypothesized value μ; any gap between the sample mean and μ is due to sampling variability.
Alternative Hypothesis (Ha): The true population mean is not equal to the hypothesized value μ.

p-value

The p-value is a key output of the one-sample t-test. It is the probability, computed assuming the null hypothesis is true, of obtaining a t-statistic at least as extreme as the one observed. A small p-value (conventionally below 0.05) is evidence against the null hypothesis and suggests that the population mean differs from the hypothesized value.
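To make the definition concrete, here is a small sketch of how a two-sided p-value follows from a t-statistic and its degrees of freedom, using R’s pt() function (the cumulative distribution function of the t-distribution). The values of t_stat and dof are hypothetical, chosen only for illustration.

```r
# Hypothetical t-statistic and degrees of freedom (assumed for illustration)
t_stat <- 2.5
dof <- 9

# Two-sided p-value: probability in both tails beyond |t| under H0
p_two_sided <- 2 * pt(-abs(t_stat), df = dof)

# Compare with the chosen significance level
alpha <- 0.05
reject_h0 <- p_two_sided < alpha

print(p_two_sided)
print(reject_h0)
```

Doubling the lower-tail probability works because the t-distribution is symmetric around zero.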

One sample t-test in R

Imagine you are conducting research on the heights of individuals. You want to determine if the sample mean height you’ve collected from 15 individuals significantly differs from the known population mean height of 170 cm.

Step 1: Define Your Data

You collect the heights of 15 individuals.

R

# Sample data
heights <- c(165, 168, 172, 170, 169, 171, 174, 168, 166, 170, 175, 172, 169, 167, 170)

Step 2: Set Up Hypotheses

Null Hypothesis (H0): The mean height of the population the sample was drawn from is equal to 170 cm (μ = 170).

Alternative Hypothesis (Ha): The mean height of that population is not equal to 170 cm (μ ≠ 170).
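The hypotheses above are two-sided. As a brief sketch, the alternative argument of R’s t.test() function selects the direction of Ha; "two.sided" is the default, and "less" or "greater" give one-sided tests. The one-sided variants are shown only for comparison and are not part of this article’s analysis.

```r
# Heights from Step 1
heights <- c(165, 168, 172, 170, 169, 171, 174, 168, 166, 170,
             175, 172, 169, 167, 170)

# Ha: true mean is not equal to 170 (the default)
p_two_sided <- t.test(heights, mu = 170, alternative = "two.sided")$p.value

# Ha: true mean is less than 170
p_less <- t.test(heights, mu = 170, alternative = "less")$p.value

# Ha: true mean is greater than 170
p_greater <- t.test(heights, mu = 170, alternative = "greater")$p.value

print(c(two_sided = p_two_sided, less = p_less, greater = p_greater))
```

The direction of the alternative should be chosen before looking at the data, since it changes the p-value.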

Step 3: Perform the One-Sample T-test

Execute the one-sample t-test in R using the t.test() function:

R

# Known population mean
pop_mean <- 170
 
# Perform one-sample t-test
result <- t.test(heights, mu = pop_mean)

Step 4: Interpret the Result

The result contains the t-value, degrees of freedom, and p-value:

R

# Display the result
print(result)

Output:

    One Sample t-test

data:  heights
t = -0.37025, df = 14, p-value = 0.7167
alternative hypothesis: true mean is not equal to 170
95 percent confidence interval:
 168.1886 171.2781
sample estimates:
mean of x 
 169.7333

One Sample t-test: The title confirms that a one-sample t-test was conducted.
Data: The test was run on the “heights” vector, which is your sample data.
t: The t-statistic is the difference between the sample mean and the hypothesized population mean, divided by the standard error of the mean. Here t = -0.37025, so the sample mean lies about 0.37 standard errors below 170.
df: The degrees of freedom, calculated as n – 1, where n is the number of observations in your sample. With 15 observations, df = 14.
p-value: The probability of observing a t-statistic at least as extreme as the one calculated if the null hypothesis were true. Here the p-value is 0.7167. Since it is greater than the typical significance level of 0.05, you fail to reject the null hypothesis.
Alternative Hypothesis: This line states the alternative hypothesis, “true mean is not equal to 170,” which shows that the test is two-sided: it checks whether the population mean differs from 170 in either direction.
95 percent confidence interval: A range of plausible values for the population mean. Here the 95% confidence interval is [168.1886, 171.2781]. Intervals constructed this way contain the true population mean in 95% of repeated samples, and this one includes 170, which agrees with the p-value.
Sample Estimates: The sample mean (mean of x), which is approximately 169.7333.
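Each of the quantities described above can also be read directly from the object that t.test() returns, which is convenient when the numbers feed into later calculations. The sketch below repeats the heights data so that it runs on its own; the variable names on the left-hand side are illustrative.

```r
# Heights from Step 1 and the test from Step 3
heights <- c(165, 168, 172, 170, 169, 171, 174, 168, 166, 170,
             175, 172, 169, 167, 170)
result <- t.test(heights, mu = 170)

t_value     <- unname(result$statistic)   # t-statistic
dof         <- unname(result$parameter)   # degrees of freedom
p_value     <- result$p.value             # p-value
ci          <- as.numeric(result$conf.int) # lower and upper confidence limits
sample_mean <- unname(result$estimate)    # sample mean

print(round(c(t = t_value, df = dof, p = p_value), 4))
print(round(ci, 4))
print(round(sample_mean, 4))
```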

In summary, the t-statistic of -0.37025 reflects a small difference between the sample mean (169.7333) and the hypothesized population mean of 170, relative to the standard error. The p-value of 0.7167 is well above the typical significance level of 0.05, so you fail to reject the null hypothesis, and the 95% confidence interval [168.1886, 171.2781] contains 170.

There is insufficient evidence to conclude that the mean height of the population differs from the known population mean of 170 cm.
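The confidence interval reported by t.test() can be reproduced by hand as the sample mean plus or minus a critical t-value times the standard error. This is a minimal sketch using qt(), the quantile function of the t-distribution; the variable names are illustrative.

```r
# Heights from Step 1
heights <- c(165, 168, 172, 170, 169, 171, 174, 168, 166, 170,
             175, 172, 169, 167, 170)

n <- length(heights)
std_error <- sd(heights) / sqrt(n)

# Critical value for a 95% two-sided interval with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)

# mean +/- t_crit * standard error
ci_manual <- mean(heights) + c(-1, 1) * t_crit * std_error

# Interval reported by t.test() for comparison
ci_builtin <- as.numeric(t.test(heights, mu = 170)$conf.int)

# Does the interval contain the hypothesized mean?
contains_170 <- ci_manual[1] <= 170 && 170 <= ci_manual[2]

print(round(ci_manual, 4))
print(contains_170)
```

For a two-sided test at the 5% level, the hypothesized mean falls inside the 95% confidence interval exactly when the p-value is above 0.05.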

Testing if a Sample Mean Differs from a Hypothesized Population Mean

Let’s consider a scenario where you are a nutritionist, and you want to assess if a new diet program significantly changes the average weight of your clients. You hypothesize that the new diet program results in an average weight of 75 kg.

R

# Sample data
weights <- c(78, 80, 76, 73, 79, 74, 76, 81, 77, 75, 82, 79, 75, 80, 78, 84, 79, 
             82, 76, 77)
 
# Hypothesized population mean
hypo_mean <- 75
 
# Perform one-sample t-test
result <- t.test(weights, mu = hypo_mean)
 
# Display the result
print(result)

Output:

    One Sample t-test

data:  weights
t = 4.6865, df = 19, p-value = 0.0001608
alternative hypothesis: true mean is not equal to 75
95 percent confidence interval:
 76.68784 79.41216
sample estimates:
mean of x 
    78.05

t is the t-test statistic (t = 4.6865)
df is the degrees of freedom (df = 19)
p-value is the probability of a t-statistic at least this extreme if the null hypothesis were true (p-value = 0.0001608)
conf.int is the 95% confidence interval for the population mean (conf.int = [76.68784, 79.41216])
sample estimates is the mean value of the sample (mean of x = 78.05)

Since the p-value is less than the common significance level (0.05), you can reject the null hypothesis.

There is enough evidence to conclude that the clients’ mean weight differs from the hypothesized population mean of 75 kg. The sample mean of 78.05 kg and the entire 95% confidence interval lie above 75 kg.
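The decision rule used here can be written out explicitly. The sketch below assumes a significance level of 0.05, as in the text; alpha and decision are illustrative names.

```r
# Weights from the example above
weights <- c(78, 80, 76, 73, 79, 74, 76, 81, 77, 75, 82, 79, 75, 80, 78,
             84, 79, 82, 76, 77)

alpha <- 0.05  # significance level chosen before running the test
result <- t.test(weights, mu = 75)

# Reject H0 only when the p-value falls below alpha
decision <- if (result$p.value < alpha) "reject H0" else "fail to reject H0"

print(decision)
```

For this data the code prints "reject H0", matching the conclusion above.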

Conclusion

In conclusion, the one-sample t-test is a valuable statistical tool for comparing a sample mean to a known population mean or a hypothesized value. Understanding the formula, the hypotheses, and each part of the t.test() output will help you use and interpret one-sample t-tests in your own research and data analysis.
