
# Two-Proportions Z-Test in R Programming

• Last Updated : 16 Jul, 2020

The two-proportions z-test is used to compare two observed proportions. For example, suppose there are two groups of individuals:

• Group A with lung cancer: n = 500
• Group B, healthy individuals: n = 500

The number of smokers in each group is as follows:

• Group A with lung cancer: n = 500, 490 smokers, pA = 490/500 = 98%
• Group B, healthy individuals: n = 500, 400 smokers, pB = 400/500 = 80%

In this setting:

• The overall proportion of smokers is $p = \frac{490 + 400}{500 + 500} = 89\%$
• The overall proportion of non-smokers is $q = 1 - p = 11\%$

We want to know whether the proportion of smokers is the same in the two groups of individuals.

#### The Formula for Two-Proportion Z-Test

The test statistic (also known as the z-statistic) can be calculated as follows:

$$z = \frac{p_A - p_B}{\sqrt{\dfrac{pq}{n_A} + \dfrac{pq}{n_B}}}$$

where,

pA: the proportion observed in group A with size nA
pB: the proportion observed in group B with size nB
p and q: the overall proportions
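As a rough check of the formula, the smoker example from the introduction can be worked through by hand in R. This is only a sketch: the variable names (`x_a`, `n_a`, and so on) are made up for illustration, and the two-sided p-value is taken from the standard normal distribution with `pnorm()`.

```r
# Counts taken from the smoker example above (illustrative variable names)
x_a <- 490; n_a <- 500   # Group A: number of smokers, group size
x_b <- 400; n_b <- 500   # Group B: number of smokers, group size

p_a <- x_a / n_a                   # observed proportion in group A
p_b <- x_b / n_b                   # observed proportion in group B
p   <- (x_a + x_b) / (n_a + n_b)   # overall (pooled) proportion of smokers
q   <- 1 - p                       # overall proportion of non-smokers

# z = (pA - pB) / sqrt(p*q/nA + p*q/nB)
z <- (p_a - p_b) / sqrt(p * q / n_a + p * q / n_b)

# Two-sided p-value from the standard normal distribution
p_value <- 2 * pnorm(-abs(z))

z
p_value
```

With these counts z comes out at roughly 9.1, far beyond the 1.96 cutoff of a two-sided test at the 5% level.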

#### Implementation in R

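In R, the function used to perform a two-proportions z-test is prop.test(). It reports the statistic as X-squared; for two groups without continuity correction, this is the square of z.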

Syntax:
prop.test(x, n, p = NULL, alternative = "two.sided", correct = TRUE)

Parameters:
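x = a vector of counts of successes, one per group (or a two-column matrix of successes and failures).
n = a vector of counts of trials, one per group (the group sizes).
p = a vector of probabilities of success under the null hypothesis; each value must lie between 0 and 1. With the default NULL and two groups, the test is whether both groups share the same proportion.
alternative = a character string specifying the alternative hypothesis: "two.sided" (the default), "less" or "greater".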
correct = a logical indicating whether Yates’ continuity correction should be applied where possible.
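To see what the `correct` argument changes, the sketch below runs prop.test() on the smoker counts from the introduction with and without Yates' correction. The variable names are illustrative. It also checks that, without the correction, the square root of X-squared agrees with the z-statistic computed from the formula above.

```r
# Smoker counts from the introductory example
x <- c(490, 400)   # smokers in groups A and B
n <- c(500, 500)   # group sizes

with_yates    <- prop.test(x, n)                   # correct = TRUE is the default
without_yates <- prop.test(x, n, correct = FALSE)  # no continuity correction

# The corrected statistic is never larger than the uncorrected one
unname(with_yates$statistic)
unname(without_yates$statistic)

# Hand-computed z using the pooled proportion
p_pool   <- sum(x) / sum(n)
z_manual <- (x[1] / n[1] - x[2] / n[2]) /
  sqrt(p_pool * (1 - p_pool) * sum(1 / n))

# Without the correction, sqrt(X-squared) matches |z|
all.equal(sqrt(unname(without_yates$statistic)), abs(z_manual))
```

Leaving `correct = TRUE` gives a slightly more conservative test, which is why the outputs in the examples mention "with continuity correction".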

Example 1:
Suppose we have two groups of students, A and B. Group A is an early morning class of 400 students, 342 of whom are female. Group B is a late class of 400 students, 290 of whom are female. At a 5% significance level, we want to know whether the proportion of female students is the same in the two groups. Let's use prop.test():

    # Two-sided test of equal proportions (the default alternative)
    prop.test(x = c(342, 290),
              n = c(400, 400))

Output:

       2-sample test for equality of proportions with continuity correction
data:  c(342, 290) out of c(400, 400)
X-squared = 19.598, df = 1, p-value = 9.559e-06
alternative hypothesis: two.sided
95 percent confidence interval:
0.07177443 0.18822557
sample estimates:
prop 1 prop 2
0.855   0.725

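The output includes:

• the p-value of the test
• the alternative hypothesis
• a 95% confidence interval for the difference between the two proportions
• the estimated proportion of successes in each group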
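The items listed above are stored as named components of the object that prop.test() returns (an object of class "htest"), so they can be pulled out and used directly. A minimal sketch for Example 1, where `res`, `alpha` and `reject_null` are illustrative names:

```r
res <- prop.test(x = c(342, 290), n = c(400, 400))

alpha <- 0.05      # significance level used in this example

res$p.value        # p-value of the test
res$alternative    # the alternative hypothesis, here "two.sided"
res$conf.int       # 95% confidence interval for pA - pB
res$estimate       # sample proportions in each group

# Reject the null hypothesis of equal proportions when p-value < alpha
reject_null <- res$p.value < alpha
reject_null
```

Comparing `res$p.value` with `alpha` in code avoids misreading values printed in scientific notation such as 9.559e-06.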

The p-value of the test, 9.558674e-06, is less than the significance level alpha = 0.05. We therefore reject the null hypothesis and conclude that the proportions of female students in the two groups differ significantly.

To test whether the proportion of females in group A (pA) is less than the proportion of females in group B (pB), the command is:

    # One-sided test: is pA less than pB?
    prop.test(x = c(342, 290),
              n = c(400, 400),
              alternative = "less")

Output:

2-sample test for equality of proportions with continuity correction

data:  c(342, 290) out of c(400, 400)
X-squared = 19.598, df = 1, p-value = 1
alternative hypothesis: less
95 percent confidence interval:
-1.0000000  0.1792664
sample estimates:
prop 1 prop 2
0.855  0.725


To test whether the proportion of females in group A (pA) is greater than the proportion of females in group B (pB), the command is:

    # One-sided test: is pA greater than pB?
    prop.test(x = c(342, 290),
              n = c(400, 400),
              alternative = "greater")

Output:

2-sample test for equality of proportions with continuity correction

data:  c(342, 290) out of c(400, 400)
X-squared = 19.598, df = 1, p-value = 4.779e-06
alternative hypothesis: greater
95 percent confidence interval:
0.08073363 1.00000000
sample estimates:
prop 1 prop 2
0.855  0.725
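The three p-values for Example 1 are closely related, which the following sketch makes explicit. The variable names are illustrative. It checks that the two one-sided p-values add up to 1 and that the two-sided p-value is twice the smaller of them.

```r
x <- c(342, 290)   # female students in groups A and B
n <- c(400, 400)   # class sizes

p_two     <- prop.test(x, n)$p.value
p_less    <- prop.test(x, n, alternative = "less")$p.value
p_greater <- prop.test(x, n, alternative = "greater")$p.value

# The two one-sided p-values are complementary
all.equal(p_less + p_greater, 1)

# The two-sided p-value is twice the smaller one-sided p-value
all.equal(p_two, 2 * min(p_less, p_greater))
```

This is why the "less" test prints a p-value of 1 while the "greater" test prints 4.779e-06, half of the two-sided value.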


Example 2:
ABC company manufactures tablets. For quality control, two sets of tablets were tested. In the first group, 32 out of 700 were found to contain some sort of defect. In the second group, 30 out of 400 were found to contain some sort of defect. Is the difference between the two groups significant? Use a 5% alpha level. Here let’s use prop.test().

    # Two-sided test of equal defect proportions
    prop.test(x = c(32, 30),
              n = c(700, 400))

Output:

       2-sample test for equality of proportions with continuity correction
data:  c(32, 30) out of c(700, 400)
X-squared = 3.5725, df = 1, p-value = 0.05874
alternative hypothesis: two.sided
95 percent confidence interval:
-0.061344109 0.002772681
sample estimates:
prop 1      prop 2
0.04571429  0.07500000

As in Example 1, the output reports the p-value, the alternative hypothesis, a 95% confidence interval and the estimated proportion in each group.

The p-value of the test, 0.0587449, is greater than the significance level alpha = 0.05. We therefore fail to reject the null hypothesis: there is no significant difference between the two proportions.

To test whether the proportion of defects in group one is less than the proportion of defects in group two, the command is:

    # One-sided test: is the defect rate in group one lower?
    prop.test(x = c(32, 30),
              n = c(700, 400),
              alternative = "less")

Output:

2-sample test for equality of proportions with continuity correction

data:  c(32, 30) out of c(700, 400)
X-squared = 3.5725, df = 1, p-value = 0.02937
alternative hypothesis: less
95 percent confidence interval:
-1.000000000 -0.002065656
sample estimates:
prop 1     prop 2
0.04571429 0.07500000


If you want to test whether the observed proportion of defects in group one is greater than the observed proportion of defects in group two, then the command is:

    # One-sided test: is the defect rate in group one higher?
    prop.test(x = c(32, 30),
              n = c(700, 400),
              alternative = "greater")

Output:

2-sample test for equality of proportions with continuity correction

data:  c(32, 30) out of c(700, 400)
X-squared = 3.5725, df = 1, p-value = 0.9706
alternative hypothesis: greater
95 percent confidence interval:
-0.05650577  1.00000000
sample estimates:
prop 1     prop 2
0.04571429 0.07500000
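To compare the three alternatives for Example 2 side by side, one option is to loop over them and collect the p-values in a small table. This is a sketch under the same 5% alpha level; `alternatives`, `p_values` and `summary_table` are illustrative names.

```r
x <- c(32, 30)     # defective tablets in groups one and two
n <- c(700, 400)   # tablets tested in each group
alpha <- 0.05      # significance level

alternatives <- c("two.sided", "less", "greater")

# Run prop.test() once per alternative and keep only the p-value
p_values <- sapply(alternatives, function(alt) {
  prop.test(x, n, alternative = alt)$p.value
})

# One row per alternative, with the decision at the 5% level
summary_table <- data.frame(
  alternative = alternatives,
  p_value     = round(p_values, 4),
  reject_null = p_values < alpha,
  row.names   = NULL
)
summary_table
```

Only the "less" alternative falls below 0.05, matching the three outputs above: the two-sided test is not significant, but the one-sided test that group one has the lower defect rate is.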

