The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used to compare two related samples, matched samples, or repeated measurements on a single sample to assess whether their population mean ranks differ, i.e. it is a paired difference test. It can be applied as an alternative to the paired Student’s t-test (also known as the “t-test for matched pairs” or “t-test for dependent samples”) when the differences between the paired observations cannot be assumed to be normally distributed. In other words, it can be used to determine whether two dependent samples were drawn from populations having the same distribution.
Wilcoxon Signed-Rank Test in R
The test comes in two forms:
- One-Sample Wilcoxon Signed Rank Test
- Paired Samples Wilcoxon Test
One-Sample Wilcoxon Signed Rank Test
The one-sample Wilcoxon signed-rank test is a non-parametric alternative to the one-sample t-test when the data cannot be assumed to be normally distributed. It’s used to determine whether the median of the population from which the sample was drawn is equal to a known standard value, i.e. a theoretical value. In R, this test takes a single function call.
Implementation in R
To perform a one-sample Wilcoxon test, R provides the function wilcox.test(), which can be used as follows:
Syntax: wilcox.test(x, mu = 0, alternative = "two.sided")
Parameters:
- x: a numeric vector containing your data values
- mu: the theoretical (hypothesized) median value. Default is 0 but you can change it.
- alternative: the alternative hypothesis. Allowed value is one of “two.sided” (default), “greater” or “less”.
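The value returned by wilcox.test() is a list of class "htest", so its components can be read directly instead of being copied from the printed output. A minimal sketch, assuming a made-up vector x and a hypothetical reference value of 2 (neither comes from the example that follows):

```r
# Hypothetical measurements (illustrative values only, chosen to avoid ties)
x <- c(1.8, 0.5, 1.6, 2.5, 1.7, 1.9, 1.4, 3.1, 1.3)

# Two-sided one-sample test against a hypothesized median of 2
res <- wilcox.test(x, mu = 2, alternative = "two.sided")

res$statistic    # the V statistic (sum of the positive ranks)
res$p.value      # the p-value as a plain number
res$alternative  # the alternative that was tested: "two.sided"
```

Reading res$p.value this way is convenient when the result feeds into further code, for example a comparison against a chosen significance level.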
Example: Here, let’s use an example data set containing the weight of 10 rabbits. We want to know whether the median weight of the rabbits differs from 25g.
R
set.seed(1234)

# Simulated weights (in grams) of 10 rabbits
myData <- data.frame(
  name = paste0("R_", 1:10),
  weight = round(rnorm(10, 30, 2), 1)
)
print(myData)

# Two-sided one-sample Wilcoxon signed-rank test against a median of 25
result <- wilcox.test(myData$weight, mu = 25)
print(result)
Output:
name weight
1 R_1 27.6
2 R_2 30.6
3 R_3 32.2
4 R_4 25.3
5 R_5 30.9
6 R_6 31.0
7 R_7 28.9
8 R_8 28.9
9 R_9 28.9
10 R_10 28.2
Wilcoxon signed rank test with continuity correction
data: myData$weight
V = 55, p-value = 0.005793
alternative hypothesis: true location is not equal to 25
In the above output, the p-value of the test is 0.005793, which is less than the significance level alpha = 0.05. So we can reject the null hypothesis and conclude that the median weight of the rabbits is significantly different from 25g.
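To see where V = 55 comes from, the statistic can be recomputed by hand: subtract the hypothesized median, rank the absolute differences, and sum the ranks that belong to positive differences. A minimal sketch using the ten weights printed in the output above:

```r
# Weights copied from the printed data frame
weight <- c(27.6, 30.6, 32.2, 25.3, 30.9, 31.0, 28.9, 28.9, 28.9, 28.2)

d <- weight - 25        # differences from the hypothesized median
r <- rank(abs(d))       # ranks of the absolute differences (ties get average ranks)
V <- sum(r[d > 0])      # sum of the ranks of the positive differences
print(V)                # 55
```

Every weight is above 25, so all ten ranks are counted and V reaches its maximum of 1 + 2 + ... + 10 = 55.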
If one wants to test whether the median weight of the rabbit is less than 25g (one-tailed test), then the code will be:
R
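set.seed(1234)
myData <- data.frame(
  name = paste0("R_", 1:10),
  weight = round(rnorm(10, 30, 2), 1)
)

# One-sided test: is the median weight less than 25?
# The result must be assigned before it is printed.
result <- wilcox.test(myData$weight, mu = 25,
                      alternative = "less")
print(result)
Output:
Wilcoxon signed rank test with continuity correction
data: myData$weight
V = 55, p-value ≈ 0.998
alternative hypothesis: true location is less than 25
All ten weights are above 25g, so the p-value is close to 1 and there is no evidence that the median weight is less than 25g.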
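Or, if one wants to test whether the median weight of the rabbit is greater than 25g (one-tailed test), then the code will be:
R
set.seed(1234)
myData <- data.frame(
  name = paste0("R_", 1:10),
  weight = round(rnorm(10, 30, 2), 1)
)

# One-sided test: is the median weight greater than 25?
# The result must be assigned before it is printed.
result <- wilcox.test(myData$weight, mu = 25,
                      alternative = "greater")
print(result)
Output:
Wilcoxon signed rank test with continuity correction
data: myData$weight
V = 55, p-value ≈ 0.0029
alternative hypothesis: true location is greater than 25
Here V takes its maximum value of 55, so the one-sided p-value is about half the two-sided p-value of 0.005793. It is below alpha = 0.05, so we can conclude that the median weight is greater than 25g.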
Paired Samples Wilcoxon Test in R
The paired samples Wilcoxon test is a non-parametric alternative to the paired t-test used to compare paired data. It’s used when the differences between the paired observations cannot be assumed to be normally distributed.
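One common way to judge the normality assumption is a Shapiro-Wilk test on the paired differences. This check is an addition here rather than part of the Wilcoxon procedure itself. A minimal sketch, assuming made-up vectors x_before and x_after:

```r
# Hypothetical paired measurements (illustrative values only)
x_before <- c(12.1, 14.3, 11.8, 15.2, 13.4, 12.9, 16.0, 11.5)
x_after  <- c(13.0, 15.9, 11.2, 18.7, 13.9, 15.5, 16.4, 14.8)

# The paired tests work on the per-pair differences
diffs <- x_after - x_before

# Shapiro-Wilk test of normality on the differences
sw <- shapiro.test(diffs)
sw$p.value  # a small p-value suggests the differences are not normal
```

With small samples the Shapiro-Wilk test has little power, so a large p-value does not by itself establish normality.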
Implementation in R
To perform the paired samples Wilcoxon test, R provides the function wilcox.test(), which can be used as follows:
Syntax: wilcox.test(x, y, paired = TRUE, alternative = "two.sided")
Parameters:
- x, y: numeric vectors
- paired: a logical value specifying that we want to compute a paired Wilcoxon test
- alternative: the alternative hypothesis. Allowed value is one of “two.sided” (default), “greater” or “less”.
Example: Here, let’s use an example data set that contains the weight of 10 rabbits before and after the treatment. We want to know whether there is any significant difference in the median weights before and after treatment.
R
# Weights of the same 10 animals before and after treatment
before <- c(190.1, 190.9, 172.7, 213, 231.4,
            196.9, 172.2, 285.5, 225.2, 113.7)
after <- c(392.9, 313.2, 345.1, 393, 434,
           227.9, 422, 383.9, 392.3, 352.2)
myData <- data.frame(
  group = rep(c("before", "after"), each = 10),
  weight = c(before, after)
)
print(myData)

# Two-sided paired test on the differences before - after
result <- wilcox.test(before, after, paired = TRUE)
print(result)
Output:
group weight
1 before 190.1
2 before 190.9
3 before 172.7
4 before 213.0
5 before 231.4
6 before 196.9
7 before 172.2
8 before 285.5
9 before 225.2
10 before 113.7
11 after 392.9
12 after 313.2
13 after 345.1
14 after 393.0
15 after 434.0
16 after 227.9
17 after 422.0
18 after 383.9
19 after 392.3
20 after 352.2
Wilcoxon signed rank test
data: before and after
V = 0, p-value = 0.001953
alternative hypothesis: true location shift is not equal to 0
In the above output, the p-value of the test is 0.001953, which is less than the significance level alpha = 0.05. We can reject the null hypothesis and conclude that the median weight of the rabbits before treatment is significantly different from the median weight after treatment.
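The paired test is the one-sample signed-rank test applied to the per-pair differences, which also explains V = 0 and the p-value. A minimal sketch reusing the same before/after weights (the names d, paired_res and onesample_res are introduced here for illustration):

```r
before <- c(190.1, 190.9, 172.7, 213, 231.4,
            196.9, 172.2, 285.5, 225.2, 113.7)
after <- c(392.9, 313.2, 345.1, 393, 434,
           227.9, 422, 383.9, 392.3, 352.2)

d <- before - after   # every difference is negative

# The paired test and the one-sample test on d give the same p-value
paired_res <- wilcox.test(before, after, paired = TRUE)
onesample_res <- wilcox.test(d, mu = 0)
all.equal(paired_res$p.value, onesample_res$p.value)  # TRUE

# No difference is positive, so the sum of positive ranks is 0
V <- sum(rank(abs(d))[d > 0])

# Exact two-sided p-value from the signed-rank distribution
2 * psignrank(V, n = length(d))  # 2 / 2^10 = 0.001953125
```

Only one of the 2^10 possible sign patterns gives V = 0, which is why the two-sided p-value equals 2/1024.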
If one wants to test whether the median weight before treatment is less than the median weight after treatment, then the code will be:
R
before <- c(190.1, 190.9, 172.7, 213, 231.4,
            196.9, 172.2, 285.5, 225.2, 113.7)
after <- c(392.9, 313.2, 345.1, 393, 434,
           227.9, 422, 383.9, 392.3, 352.2)

# Pass the vectors directly so the differences are before - after.
# With the formula weight ~ group, the levels are sorted alphabetically
# ("after" first), which reverses the direction of the test.
result <- wilcox.test(before, after,
                      paired = TRUE,
                      alternative = "less")
print(result)
Output:
Wilcoxon signed rank test
data: before and after
V = 0, p-value = 0.0009766
alternative hypothesis: true location shift is less than 0
The p-value is below alpha = 0.05, so we can conclude that the median weight before treatment is less than the median weight after treatment.
Or, if one wants to test whether the median weight before treatment is greater than the median weight after treatment, then the code will be:
R
before <- c(190.1, 190.9, 172.7, 213, 231.4,
            196.9, 172.2, 285.5, 225.2, 113.7)
after <- c(392.9, 313.2, 345.1, 393, 434,
           227.9, 422, 383.9, 392.3, 352.2)

# Pass the vectors directly so the differences are before - after
result <- wilcox.test(before, after,
                      paired = TRUE,
                      alternative = "greater")
print(result)
Output:
Wilcoxon signed rank test
data: before and after
V = 0, p-value = 1
alternative hypothesis: true location shift is greater than 0
The p-value is 1, so there is no evidence that the median weight before treatment is greater than the median weight after treatment.