Upper Tail Test of Population Mean with Known Variance in R

A statistical hypothesis test is a method of statistical inference used to decide whether the data at hand sufficiently support a particular hypothesis.

The conventional steps followed in a hypothesis test are as follows:

State the null hypothesis (H0) and the alternative hypothesis (Ha).
Collect a relevant sample of data to test the hypothesis.
Choose a significance level for the hypothesis test.
Perform an appropriate statistical test.
Based on the test statistic and p-value, decide whether to reject or fail to reject the null hypothesis.

In an upper-tail test, the null hypothesis states that the true population mean (μ) is less than or equal to the hypothesized mean value (μ0), and the alternative hypothesis states that it is greater. We fail to reject the null hypothesis if the test statistic is less than the critical value at the chosen significance level. In this article, let us discuss how to conduct an upper-tail test of the population mean with known variance.

Here the assumption is that the population variance σ² is known. By the Central Limit Theorem (CLT), the sample means of all possible samples of size n from a population approximately follow a normal distribution when n is large enough. Let us define the test statistic based on the CLT as follows

z = (x̄ − μ0) / (σ / √n)

where x̄ is the sample mean, μ0 is the hypothesized mean value, σ is the known population standard deviation, and n is the sample size.

If z ≥ zα, where zα is the 100(1 − α) percentile of the standard normal distribution, we reject the null hypothesis.
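The decision rule above can be wrapped in a small function. This is only a minimal sketch: the function name upper_tail_z_test and its argument names are hypothetical (they are not part of base R), and the inputs in the usage lines are made-up numbers chosen purely for illustration.

```r
# Hypothetical helper (not a base R function): upper-tail z-test
# for a population mean when the population standard deviation is known.
upper_tail_z_test <- function(xbar, mu0, sigma, n, alpha = 0.05) {
  z <- (xbar - mu0) / (sigma / sqrt(n))  # test statistic
  z_alpha <- qnorm(1 - alpha)            # upper-tail critical value
  list(z = z, critical = z_alpha, reject = z >= z_alpha)
}

# Illustrative inputs only (made-up numbers, not the case study data)
res <- upper_tail_z_test(xbar = 10.5, mu0 = 10, sigma = 2, n = 64)
res$z       # test statistic
res$reject  # TRUE means the null hypothesis is rejected
```

Returning the statistic, the critical value, and the decision together makes it easy to report all three, which is why a list is used here rather than a single logical value.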

Let us try to understand the upper-tail test by considering a case study.

Assume a data labeling company states that there are no more than 2 errors in marked labels on any single page. Assume that in a sample of 40 pages the mean number of errors per page is 2.12, and the population standard deviation is 0.25. At the .05 significance level, can we reject the null hypothesis that the mean data labeling error per page is at most 2?

Example:

Let us start by computing the test statistic as shown

# sample mean
sample_mean = 2.12

# hypothesized mean value
mu0 = 2

# known population standard deviation and sample size
pop_std_dev = 0.25
sample_size = 40

# test statistic
z = (sample_mean - mu0) / (pop_std_dev / sqrt(sample_size))
z

Output:

3.035786553

Then compute the critical value at the .05 significance level.

alpha = .05  

# critical value 

z.alpha = qnorm(1-alpha)  
z.alpha

Output:

1.644854
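The same critical value can also be expressed on the scale of the sample mean, which gives the largest sample mean for which the null hypothesis would not be rejected. The sketch below assumes the values used in the example code (hypothesized mean 2, population standard deviation 0.25, sample size 40); the variable name xbar_crit is chosen here for illustration.

```r
# Values taken from the example code above
mu0 <- 2
sigma <- 0.25
n <- 40
alpha <- 0.05

# Largest sample mean that does not lead to rejection (about 2.065)
xbar_crit <- mu0 + qnorm(1 - alpha) * sigma / sqrt(n)
xbar_crit

# The observed sample mean of 2.12 lies above this bound
2.12 > xbar_crit
```

Comparing the sample mean with this bound is equivalent to comparing z with the critical value, since one is a linear rescaling of the other.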

The test statistic 3.04 is greater than the critical value of 1.6449. Hence, at the .05 significance level, we reject the null hypothesis that the mean labeling error is at most 2 per page.
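As an alternative to comparing against the critical value, the upper-tail p-value of the test statistic can be computed with pnorm. This is a minimal sketch that recomputes z from the values in the example code (sample mean 2.12, hypothesized mean 2, population standard deviation 0.25, sample size 40); the variable name p_value is chosen here for illustration.

```r
# Recompute the test statistic from the example values
z <- (2.12 - 2) / (0.25 / sqrt(40))

# Upper-tail p-value: probability of a standard normal value at least as large as z
p_value <- pnorm(z, lower.tail = FALSE)
p_value

# Reject the null hypothesis when the p-value is below the significance level
alpha <- 0.05
p_value < alpha
```

The p-value comes out at roughly 0.0012, well below .05, so this approach leads to the same decision as the critical value comparison.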
