
# Two-Proportions Z-Test in R Programming

A two-proportion z-test allows us to compare two proportions to see if they are the same.

The accompanying confidence interval gives a range of values that is likely to include the difference between the population proportions.

Under the null hypothesis, the test statistic follows a standard normal distribution. For a two-tailed test at the 5% significance level, the critical value is 1.96.
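The 1.96 figure can be recovered from the standard normal quantile function. This is a minimal sketch using base R's qnorm(); the variable names are illustrative.

```r
# Two-tailed critical value at the 5% significance level
alpha <- 0.05
z_crit <- qnorm(1 - alpha / 2)
print(round(z_crit, 2))  # 1.96
```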

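In R, the two-proportions z-test is performed with prop.test(). The function reports a chi-squared statistic (X-squared) with one degree of freedom, which for two groups is the square of the z statistic (continuity-corrected by default), so both give the same two-sided p-value.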

Syntax:

    prop.test(x, n, p = NULL,
              alternative = c("two.sided", "less", "greater"),
              correct = TRUE)

Parameters:

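x = a vector of counts of successes, one per group.

n = a vector of counts of trials (the size of each group).

p = a vector of probabilities of success under the null hypothesis. Each value must be between 0 and 1. When p is NULL and there are two groups, the null hypothesis is that the groups share the same proportion.

alternative = a character string specifying the alternative hypothesis: "two.sided" (the default), "less", or "greater".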

correct = a logical indicating whether Yates’ continuity correction should be applied where possible.
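To see what the correct argument changes, the sketch below runs the same test with and without Yates' continuity correction. The counts are illustrative values chosen for this sketch, not data from the article.

```r
# Illustrative counts (an assumption for this sketch): successes out of 50 trials per group
x <- c(15, 25)
n <- c(50, 50)

with_correction    <- prop.test(x, n)                   # correct = TRUE is the default
without_correction <- prop.test(x, n, correct = FALSE)

# The continuity correction shrinks the statistic and enlarges the p-value
print(unname(with_correction$statistic))
print(unname(without_correction$statistic))
print(with_correction$p.value)
print(without_correction$p.value)
```

The corrected test is the more conservative of the two, which is why it is the default.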

There are two types of hypotheses:

The null hypothesis H0 is that the two proportions are the same:

H_{0}: p_{A}=p_{B}

The alternative hypothesis Ha is that the proportions are not the same. It takes one of three forms:

H_{a}: p_{A} \neq p_{B} \text { (different) }

H_{a}: p_{A}>p_{B} \text { (greater) }

H_{a}: p_{A}<p_{B} \text { (less) }

The two-proportions z-test is used to compare two observed proportions. For example, let there be two groups of individuals:

• Group A with lung cancer: n = 500
• Group B, healthy individuals: n = 500

The number of smokers in each group is as follows:

• Group A with lung cancer: n = 500, 490 smokers, pA = 490/500 = 0.98 (98%)
• Group B, healthy individuals: n = 500, 400 smokers, pB = 400/500 = 0.80 (80%)

In this setting:

• The overall proportion of smokers is p = (490 + 400) / (500 + 500) = 0.89 (89%)
• The overall proportion of non-smokers is q = 1 – p = 0.11 (11%)

We want to know whether the proportion of smokers is the same in the two groups of individuals.

#### The Formula for Two-Proportion Z-Test

The test statistic (also known as the z-statistic) can be calculated as follows:

z = \frac{p_{A}-p_{B}}{\sqrt{p q / n_{A}+p q / n_{B}}}

where:

• pA: the proportion observed in group A with size nA
• pB: the proportion observed in group B with size nB
• p and q: the overall proportions
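As a worked illustration, the statistic z = (pA - pB) / sqrt(p*q/nA + p*q/nB) can be computed by hand for the lung cancer example. This is a sketch in base R; the variable names are chosen here for readability.

```r
# Counts from the lung cancer example
x_a <- 490; n_a <- 500   # smokers in group A
x_b <- 400; n_b <- 500   # smokers in group B

p_a <- x_a / n_a                   # 0.98
p_b <- x_b / n_b                   # 0.80
p   <- (x_a + x_b) / (n_a + n_b)   # overall proportion of smokers, 0.89
q   <- 1 - p                       # overall proportion of non-smokers, 0.11

# z = (pA - pB) / sqrt(p*q/nA + p*q/nB)
z <- (p_a - p_b) / sqrt(p * q / n_a + p * q / n_b)
print(z)   # roughly 9.1

# Two-sided p-value from the standard normal distribution
p_value <- 2 * pnorm(-abs(z))
print(p_value)
```

Since |z| is far above the 1.96 critical value, the hand calculation points to a significant difference in smoking rates between the two groups.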

### Example 1

Suppose we have two groups of students, A and B. Group A is an early morning class of 400 students, 342 of whom are female. Group B is a late class of 400 students, 290 of whom are female. We want to know whether the proportion of female students is the same in the two groups, using a 5% significance level (alpha = 0.05). We use prop.test():

    # Two-sided test: is the proportion of female students the same in both groups?
    prop.test(x = c(342, 290),
              n = c(400, 400))

Output:

    	2-sample test for equality of proportions with continuity correction

    data:  c(342, 290) out of c(400, 400)
    X-squared = 19.598, df = 1, p-value = 9.559e-06
    alternative hypothesis: two.sided
    95 percent confidence interval:
     0.07177443 0.18822557
    sample estimates:
    prop 1 prop 2 
     0.855  0.725 

The output reports:

• the p-value of the test
• the alternative hypothesis
• a 95% confidence interval for the difference between the proportions
• the estimated proportion of successes in each group
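The reported values can also be read directly from the object that prop.test() returns, which is convenient when the decision has to be made in code. This is a small sketch; res and alpha are names chosen here.

```r
res <- prop.test(x = c(342, 290), n = c(400, 400))

print(res$p.value)      # p-value of the test
print(res$alternative)  # "two.sided"
print(res$conf.int)     # 95% confidence interval for pA - pB
print(res$estimate)     # sample proportions: 0.855 and 0.725

# Decision at the 5% significance level
alpha <- 0.05
print(res$p.value < alpha)  # TRUE means the null hypothesis is rejected
```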

The p-value of the test, 9.558674e-06, is less than the significance level alpha = 0.05, so we reject the null hypothesis: the proportions of female students in the two groups differ significantly. To test whether the proportion of females in group A (pA) is less than the proportion of females in group B (pB), the command is:

    # One-sided test: is pA less than pB?
    prop.test(x = c(342, 290),
              n = c(400, 400),
              alternative = "less")

Output:

    	2-sample test for equality of proportions with continuity correction

    data:  c(342, 290) out of c(400, 400)
    X-squared = 19.598, df = 1, p-value = 1
    alternative hypothesis: less
    95 percent confidence interval:
     -1.0000000  0.1792664
    sample estimates:
    prop 1 prop 2 
     0.855  0.725 

With a p-value of 1, there is no evidence that pA is less than pB. To test whether the proportion of females in group A (pA) is greater than the proportion of females in group B (pB), the command is:

    # One-sided test: is pA greater than pB?
    prop.test(x = c(342, 290),
              n = c(400, 400),
              alternative = "greater")

Output:

    	2-sample test for equality of proportions with continuity correction

    data:  c(342, 290) out of c(400, 400)
    X-squared = 19.598, df = 1, p-value = 4.779e-06
    alternative hypothesis: greater
    95 percent confidence interval:
     0.08073363 1.00000000
    sample estimates:
    prop 1 prop 2 
     0.855  0.725 
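To connect prop.test() back to the z formula, the sketch below computes z by hand for Example 1 and compares its square with the X-squared value that prop.test() returns when the continuity correction is switched off. The variable names are illustrative.

```r
x <- c(342, 290)
n <- c(400, 400)

# Hand-computed z statistic using the overall (pooled) proportion
p_hat <- x / n
p <- sum(x) / sum(n)
z <- (p_hat[1] - p_hat[2]) / sqrt(p * (1 - p) * sum(1 / n))

# prop.test() without Yates' continuity correction
res <- prop.test(x, n, correct = FALSE)

print(z^2)
print(unname(res$statistic))   # matches z^2 up to floating-point error
```

With the default correct = TRUE the statistic is slightly smaller, which is why the outputs above show 19.598 rather than z squared.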

### Example 2

ABC company manufactures tablets. For quality control, two sets of tablets were tested. In the first group, 32 out of 700 were found to contain some sort of defect. In the second group, 30 out of 400 were found to contain some sort of defect. Is the difference between the two groups significant at the 5% significance level? We use prop.test():

    # Two-sided test: is the defect proportion the same in both groups?
    prop.test(x = c(32, 30),
              n = c(700, 400))

Output:

    	2-sample test for equality of proportions with continuity correction

    data:  c(32, 30) out of c(700, 400)
    X-squared = 3.5725, df = 1, p-value = 0.05874
    alternative hypothesis: two.sided
    95 percent confidence interval:
     -0.061344109  0.002772681
    sample estimates:
        prop 1     prop 2 
    0.04571429 0.07500000 

As in Example 1, the output reports:

• the p-value of the test
• the alternative hypothesis
• a 95% confidence interval for the difference between the proportions
• the estimated proportion of defects in each group

The p-value of the test, 0.0587449, is greater than the significance level alpha = 0.05, so we fail to reject the null hypothesis: there is no significant difference between the two proportions. To test whether the proportion of defects in group one is less than the proportion of defects in group two, the command is:

    # One-sided test: is the defect proportion in group one less than in group two?
    prop.test(x = c(32, 30),
              n = c(700, 400),
              alternative = "less")

The test statistic is the same as in the two-sided test (X-squared = 3.5725, df = 1). Because the observed difference lies in the tested direction (0.0457 in group one versus 0.0750 in group two), the one-sided p-value is half the two-sided value, about 0.029. This is below 0.05, so at the 5% level the one-sided test indicates that the defect proportion in group one is lower than in group two.

If we want to test whether the observed proportion of defects in group one is greater than the observed proportion of defects in group two, then the command is:

    # One-sided test: is the defect proportion in group one greater than in group two?
    prop.test(x = c(32, 30),
              n = c(700, 400),
              alternative = "greater")

The test statistic is again X-squared = 3.5725 with df = 1. The observed difference points the opposite way, so the p-value is the complement of the "less" p-value, about 0.971. There is no evidence that the defect proportion in group one is greater than in group two.
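The relationship between the three alternatives can be checked numerically for the Example 2 counts. This is a sketch that relies only on prop.test() and its p.value field; the variable names are chosen here.

```r
x <- c(32, 30)
n <- c(700, 400)

p_two     <- prop.test(x, n, alternative = "two.sided")$p.value
p_less    <- prop.test(x, n, alternative = "less")$p.value
p_greater <- prop.test(x, n, alternative = "greater")$p.value

print(c(two.sided = p_two, less = p_less, greater = p_greater))

# The one-sided p-values are complementary, and the two-sided value
# is twice the smaller of the two
print(all.equal(p_less + p_greater, 1))
print(all.equal(p_two, 2 * min(p_less, p_greater)))
```

This is why a result that narrowly misses significance in the two-sided test can become significant in a one-sided test, so the direction of the alternative should be chosen before looking at the data.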