Python | Kendall Rank Correlation Coefficient
Last Updated: 20 Jul, 2021
What is a correlation test?
A correlation test measures the strength of the association between two variables. For instance, to find out whether there is a relationship between the heights of fathers and sons, we can calculate a correlation coefficient.
Methods for correlation analysis:
There are two main types of correlation:
- Parametric correlation – Pearson correlation (r): It measures the linear dependence between two variables (x and y). It is called a parametric correlation test because it depends on the distribution of the data.
- Non-parametric correlation – Kendall (tau) and Spearman (rho): These are rank-based correlation coefficients, so they depend only on the ordering of the values rather than on their distribution.
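The difference between the two families shows up on data that is monotonic but not linear. The sketch below uses a small made-up dataset (the values of x and y are assumptions for illustration, not data from this article) and the SciPy functions pearsonr, spearmanr and kendalltau:

```python
from scipy.stats import kendalltau, pearsonr, spearmanr

# Hypothetical data: y always increases with x, but not along a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

# Each function returns the coefficient first and a p-value second.
r, _ = pearsonr(x, y)
rho, _ = spearmanr(x, y)
tau, _ = kendalltau(x, y)

print(f"Pearson r:    {r:.3f}")
print(f"Spearman rho: {rho:.3f}")
print(f"Kendall tau:  {tau:.3f}")
```

Because the ordering of x and y is identical, both rank-based coefficients come out at 1, while Pearson's r falls below 1 because the relationship is not a straight line.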
Kendall Rank Correlation Coefficient formula:
tau = (number of concordant pairs – number of discordant pairs) / (n(n – 1) / 2)
where,
- Concordant Pair: A pair of observations (x1, y1) and (x2, y2) that satisfies either
  - x1 > x2 and y1 > y2, or
  - x1 < x2 and y1 < y2
- Discordant Pair: A pair of observations (x1, y1) and (x2, y2) that satisfies either
  - x1 > x2 and y1 < y2, or
  - x1 < x2 and y1 > y2
- n: Total number of samples
Note: A pair for which x1 = x2 or y1 = y2 is tied. Tied pairs are classified as neither concordant nor discordant and are not counted in either total.
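The definitions above translate fairly directly into code. The following is a minimal sketch, assuming the tie-free form tau = (C - D) / (n(n - 1) / 2); the function name kendall_tau_counts and the four-item input are made up for illustration:

```python
from itertools import combinations


def kendall_tau_counts(x, y):
    """Return (concordant, discordant, tau) for two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two observations are needed")
    concordant = discordant = 0
    # Visit every unordered pair of observations exactly once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both differences have the same sign
            concordant += 1
        elif product < 0:    # the differences have opposite signs
            discordant += 1
        # product == 0 means a tie in x or y: counted as neither
    n = len(x)
    total_pairs = n * (n - 1) // 2
    tau = (concordant - discordant) / total_pairs
    return concordant, discordant, tau


# Hypothetical four-item example, not taken from the article.
c, d, tau = kendall_tau_counts([1, 2, 3, 4], [1, 3, 2, 4])
print(c, d, round(tau, 5))  # prints: 5 1 0.66667
```

Multiplying the two differences is a compact way to test the sign conditions: a positive product means the pair is concordant, a negative product means it is discordant.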
Example: Let’s consider two experts’ rankings of food items, shown in the table below.
Items | Expert 1 | Expert 2
1 | 1 | 1
2 | 2 | 3
3 | 3 | 6
4 | 4 | 2
5 | 5 | 7
6 | 6 | 4
7 | 7 | 5
The table shows that expert 1 and expert 2 both give item 1 rank 1. Similarly, expert 1 gives item 2 rank 2 whereas expert 2 gives it rank 3, and so on.
Step 1:
First, according to the formula, we have to find the number of concordant pairs and the number of discordant pairs. Take the rows for item 1 and item 2. For expert 1, x1 = 1 and x2 = 2. Similarly, for expert 2, y1 = 1 and y2 = 3. The condition x1 < x2 and y1 < y2 is satisfied, so the rows for item 1 and item 2 form a concordant pair.
Now take the rows for item 2 and item 4. For expert 1, x1 = 2 and x2 = 4. For expert 2, y1 = 3 and y2 = 2. The condition x1 < x2 and y1 > y2 is satisfied, so the rows for item 2 and item 4 form a discordant pair.
Comparing every pair of rows in this way gives the number of concordant and discordant pairs. The complete result is given in the table below, where each cell compares the item of its row with the item of its column: C marks a concordant pair and D a discordant pair.
Item | 1 | 2 | 3 | 4 | 5 | 6 | 7
1 | | | | | | |
2 | C | | | | | |
3 | C | C | | | | |
4 | C | D | D | | | |
5 | C | C | C | C | | |
6 | C | C | D | C | D | |
7 | C | C | D | C | D | C |
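The pairwise comparison can also be generated programmatically. The sketch below labels every pair from the two expert rankings; the helper name label_pair and the "T" label for ties are choices made here for illustration:

```python
expert_1 = [1, 2, 3, 4, 5, 6, 7]
expert_2 = [1, 3, 6, 2, 7, 4, 5]


def label_pair(x1, y1, x2, y2):
    """Return 'C' for a concordant pair, 'D' for discordant, 'T' for a tie."""
    product = (x1 - x2) * (y1 - y2)
    if product > 0:
        return "C"
    if product < 0:
        return "D"
    return "T"


# rows[i] holds the labels for item i+1 compared with each earlier item.
rows = [
    [label_pair(expert_1[i], expert_2[i], expert_1[j], expert_2[j]) for j in range(i)]
    for i in range(len(expert_1))
]

for item, labels in enumerate(rows, start=1):
    print(item, " ".join(labels))

labels_flat = [label for labels in rows for label in labels]
print("concordant:", labels_flat.count("C"))  # prints: concordant: 15
print("discordant:", labels_flat.count("D"))  # prints: discordant: 6
```

Each printed row lists the comparisons of one item against all items before it, so the first row is empty and the last row has six labels.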
Step 2:
From the table above, we find that:
The number of concordant pairs is: 15
The number of discordant pairs is: 6
The total number of samples/items is: 7
The total number of pairs is n(n – 1) / 2 = 7 × 6 / 2 = 21. Hence, applying the Kendall Rank Correlation Coefficient formula:
tau = (15 – 6) / 21 = 0.42857
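The arithmetic of this step can be checked in a few lines. This is only a sketch of the calculation with the counts typed in by hand; math.comb requires Python 3.8 or later:

```python
from math import comb

concordant = 15
discordant = 6
n = 7

# Number of distinct pairs among n items: n(n - 1) / 2
total_pairs = comb(n, 2)

# With no ties, every pair is either concordant or discordant.
assert concordant + discordant == total_pairs

tau = (concordant - discordant) / total_pairs
print(total_pairs)          # prints: 21
print(f"tau = {tau:.5f}")   # prints: tau = 0.42857
```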
Tau ranges from -1 to 1. A value close to 1 means the two experts broadly agree, a value close to -1 means their rankings are nearly opposite, and a value near 0 means there is little association between them. Here, 0.42857 indicates moderate agreement between the two experts.
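To see the two extremes, the sketch below compares a ranking with itself and with its reverse. The helper tau_no_ties is a hypothetical name and, as labeled, assumes there are no tied ranks:

```python
from itertools import combinations


def tau_no_ties(x, y):
    """Kendall tau for rankings without ties (assumption: no repeated values)."""
    pairs = list(combinations(zip(x, y), 2))
    # +1 for each concordant pair, -1 for each discordant pair.
    score = sum(1 if (a[0] - b[0]) * (a[1] - b[1]) > 0 else -1 for a, b in pairs)
    return score / len(pairs)


ranking = [1, 2, 3, 4, 5, 6, 7]

same = tau_no_ties(ranking, ranking)              # identical rankings
opposite = tau_no_ties(ranking, ranking[::-1])    # fully reversed rankings
experts = tau_no_ties(ranking, [1, 3, 6, 2, 7, 4, 5])

print(same)               # prints: 1.0
print(opposite)           # prints: -1.0
print(round(experts, 5))  # prints: 0.42857
```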
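kendalltau(): SciPy function to compute the Kendall Rank Correlation Coefficient in Python
Syntax:
kendalltau(x, y)
- x, y: Numeric lists (or arrays) of the same length
The function returns the correlation coefficient together with a p-value for the test of no association. By default it computes the tau-b variant, which adjusts for ties; when there are no ties, as in the worked example, tau-b gives the same value as the calculation above.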
Code: Python program to illustrate Kendall Rank correlation
from scipy.stats import kendalltau
# Ranks given by expert 1 (X) and expert 2 (Y)
X = [1, 2, 3, 4, 5, 6, 7]
Y = [1, 3, 6, 2, 7, 4, 5]
# kendalltau returns the coefficient and a p-value; the p-value is discarded here
corr, _ = kendalltau(X, Y)
print('Kendall Rank correlation: %.5f' % corr)
Output:
Kendall Rank correlation: 0.42857
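The program above discards the second value returned by kendalltau. As a hedged sketch, that value can be kept and read as a p-value for the null hypothesis of no association; the 0.05 threshold below is a conventional choice assumed here, not something prescribed by the function:

```python
from scipy.stats import kendalltau

X = [1, 2, 3, 4, 5, 6, 7]
Y = [1, 3, 6, 2, 7, 4, 5]

# Keep both return values: the coefficient and the two-sided p-value.
tau, p_value = kendalltau(X, Y)

alpha = 0.05  # assumed significance level for this illustration
print(f"tau = {tau:.5f}")
print(f"p-value = {p_value:.5f}")
if p_value < alpha:
    print("The association is statistically significant at the 5% level.")
else:
    print("The association is not statistically significant at the 5% level.")
```

With only seven items, a tau of about 0.43 is not enough to reject the hypothesis of no association at this level, so the second message is the one that prints.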