Prerequisite: Correlation Coefficient
Given two arrays X and Y, find Spearman's rank correlation. Instead of working with the data values themselves (as discussed in Correlation Coefficient), Spearman's rank correlation works with the ranks of those values: the observations are ranked first, and the ranks are then used to compute the correlation. The algorithm is as follows:
1. Rank each observation in X and store the ranks in Rank_X.
2. Rank each observation in Y and store the ranks in Rank_Y.
3. Obtain the Pearson correlation coefficient of Rank_X and Rank_Y.
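The formula used to calculate Pearson's correlation coefficient r of sets X and Y, each with n observations, is as follows:
r = (n*sigma_xy - sigma_x*sigma_y) / sqrt( [n*sigma_xsq - (sigma_x)^2] * [n*sigma_ysq - (sigma_y)^2] )
Here sigma_x and sigma_y are the sums of the elements of X and Y, sigma_xy is the sum of the products X[i]*Y[i], and sigma_xsq and sigma_ysq are the sums of the squares of the elements of X and Y.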
Algorithm for calculating Pearson's coefficient of sets X and Y:
function correlationCoefficient(X, Y)
    n = X.size
    sigma_x = sigma_y = sigma_xy = 0
    sigma_xsq = sigma_ysq = 0
    for i in 0...n-1
        sigma_x = sigma_x + X[i]
        sigma_y = sigma_y + Y[i]
        sigma_xy = sigma_xy + X[i] * Y[i]
        sigma_xsq = sigma_xsq + X[i] * X[i]
        sigma_ysq = sigma_ysq + Y[i] * Y[i]
    num = (n * sigma_xy - sigma_x * sigma_y)
    den = sqrt( [n*sigma_xsq - (sigma_x)^2] * [n*sigma_ysq - (sigma_y)^2] )
    return num/den
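The pseudocode above translates fairly directly to Python. The following is a minimal sketch rather than a reference implementation; the name `correlation_coefficient` and the two error checks are additions made here for illustration and are not part of the original algorithm.

```python
import math


def correlation_coefficient(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    # Assumption: reject inputs the pseudocode does not handle.
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")

    # Accumulate the five sums used by the formula.
    sigma_x = sum(x)
    sigma_y = sum(y)
    sigma_xy = sum(a * b for a, b in zip(x, y))
    sigma_xsq = sum(a * a for a in x)
    sigma_ysq = sum(b * b for b in y)

    num = n * sigma_xy - sigma_x * sigma_y
    den = math.sqrt((n * sigma_xsq - sigma_x ** 2) *
                    (n * sigma_ysq - sigma_y ** 2))
    # Assumption: a constant input makes the denominator zero.
    if den == 0:
        raise ValueError("correlation is undefined for a constant input")
    return num / den
```

Calling `correlation_coefficient([1, 2, 3, 4, 5], [1, 2, 4, 3, 5])` on the ranks of the first example further down should give a value close to 0.9.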
While assigning ranks, we may encounter ties, i.e. two or more observations having the same rank. To resolve ties, we use the fractional ranking scheme. In this scheme, if n observations share the same rank, each of them gets a fractional rank given by:
fractional_rank = (rank) + (n-1)/2
The next rank to be assigned is rank + n, not rank + 1. For instance, if 3 items share the rank r, each gets the fractional rank r + 1, and the next rank that can be given to another observation is r + 3. Note that fractional ranks need not be fractions: they are the arithmetic mean of the n consecutive ranks r, r + 1, r + 2, ..., r + n - 1.
(r + r+1 + r+2 + ... + r+n-1) / n = r + (n-1)/2
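The identity above can be checked numerically. The sketch below uses two hypothetical helper names, `fractional_rank` and `mean_of_consecutive_ranks`, chosen here only for illustration; it compares the closed form with the explicit arithmetic mean.

```python
def fractional_rank(rank, n):
    """Closed form: rank shared by n tied observations starting at `rank`."""
    return rank + (n - 1) / 2


def mean_of_consecutive_ranks(rank, n):
    """Arithmetic mean of the ranks rank, rank + 1, ..., rank + n - 1."""
    return sum(range(rank, rank + n)) / n


# The two computations agree for every small starting rank and tie size.
for r in range(1, 6):
    for n in range(1, 6):
        assert fractional_rank(r, n) == mean_of_consecutive_ranks(r, n)
```

With three observations tied from rank 3 onward, both functions return 4.0, which is a whole number even though it comes from the fractional scheme.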
Some examples:
Input:
X = [15 18 19 20 21]
Y = [25 26 28 27 29]
Solution:
Rank_X = [1 2 3 4 5]
Rank_Y = [1 2 4 3 5]
sigma_x = 1+2+3+4+5 = 15
sigma_y = 1+2+4+3+5 = 15
sigma_xy = 1*1+2*2+3*4+4*3+5*5 = 54
sigma_xsq = 1*1+2*2+3*3+4*4+5*5 = 55
sigma_ysq = 1*1+2*2+4*4+3*3+5*5 = 55
Substitute the values in the formula:
Coefficient = Pearson(Rank_X, Rank_Y) = 0.9

Input:
X = [15 18 21 15 21]
Y = [25 25 27 27 27]
Solution:
Rank_X = [1.5 3 4.5 1.5 4.5]
Rank_Y = [1.5 1.5 4 4 4]
Calculate and substitute the values of sigma_x, sigma_y, sigma_xy, sigma_xsq and sigma_ysq:
Coefficient = Pearson(Rank_X, Rank_Y) = 0.456435
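For readers who want to follow the substitution step, here is the arithmetic written out for both examples. The sums for the second example (48.75, 54 and 52.5) are not stated in the text; they were computed here from the listed ranks, so treat them as a worked check rather than as part of the original solution.

```latex
% Example 1: n = 5, \sum x = \sum y = 15, \sum xy = 54, \sum x^2 = \sum y^2 = 55
r_1 = \frac{5(54) - (15)(15)}{\sqrt{[5(55) - 15^2]\,[5(55) - 15^2]}}
    = \frac{45}{\sqrt{50 \cdot 50}} = 0.9

% Example 2: n = 5, \sum x = \sum y = 15, \sum xy = 48.75, \sum x^2 = 54, \sum y^2 = 52.5
r_2 = \frac{5(48.75) - (15)(15)}{\sqrt{[5(54) - 15^2]\,[5(52.5) - 15^2]}}
    = \frac{18.75}{\sqrt{45 \cdot 37.5}} \approx 0.456435
```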
The algorithm for the fractional ranking scheme is given below:
function rankify(X)
    N = X.size()
    // Vector to store ranks
    Rank_X(N)
    for i = 0 ... N-1
        r = 1 and s = 1
        // Count smaller (r) and equal (s) elements in 0...i-1
        for j = 0...i-1
            if X[j] < X[i] r = r+1
            if X[j] == X[i] s = s+1
        // Count smaller (r) and equal (s) elements in i+1...N-1
        for j = i+1...N-1
            if X[j] < X[i] r = r+1
            if X[j] == X[i] s = s+1
        // Assign fractional rank
        Rank_X[i] = r + (s-1) * 0.5
    return Rank_X
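A minimal Python sketch of the ranking pseudocode above follows. It keeps the same quadratic approach (compare every element with every other one) and merges the two inner loops into one that skips index i; that merge is a simplification made here, not something the original prescribes.

```python
def rankify(x):
    """Return the fractional (average) ranks of the values in x."""
    n = len(x)
    ranks = [0.0] * n
    for i in range(n):
        r = 1  # 1 + number of elements strictly smaller than x[i]
        s = 1  # number of elements equal to x[i], counting x[i] itself
        for j in range(n):
            if j == i:
                continue
            if x[j] < x[i]:
                r += 1
            elif x[j] == x[i]:
                s += 1
        # Tied values share the mean of the s consecutive ranks r ... r+s-1.
        ranks[i] = r + (s - 1) * 0.5
    return ranks


print(rankify([15, 18, 21, 15, 21]))  # ranks of X from the second example
print(rankify([25, 25, 27, 27, 27]))  # ranks of Y from the second example
```

This runs in O(N^2) time, which is fine for short arrays; sorting the values first would be the usual choice for large inputs.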
There is a direct formula to calculate Spearman's coefficient, given by
rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))
where d_i is the difference between the two ranks of the i-th observation. However, this formula needs a correction term for each tie, and hence it has not been discussed here. Calculating Spearman's coefficient as the correlation coefficient of the ranks is the most general method.
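To see why ties matter, the sketch below assumes that the direct formula referred to is the standard one, rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), with d_i the difference between the two ranks of observation i. The function name `spearman_from_rank_differences` is hypothetical, and the function is applied to the ranks from the two examples above without any tie correction.

```python
def spearman_from_rank_differences(rank_x, rank_y):
    """Uncorrected shortcut formula; only exact when there are no ties."""
    n = len(rank_x)
    if n < 2 or n != len(rank_y):
        raise ValueError("need two rank lists of equal length >= 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Example 1 (no ties): agrees with the Pearson-on-ranks result of 0.9.
no_ties = spearman_from_rank_differences([1, 2, 3, 4, 5], [1, 2, 4, 3, 5])

# Example 2 (ties): differs from the Pearson-on-ranks result of 0.456435.
with_ties = spearman_from_rank_differences([1.5, 3, 4.5, 1.5, 4.5],
                                           [1.5, 1.5, 4, 4, 4])
print(no_ties, with_ties)
```

On the tied data the uncorrected shortcut gives about 0.55 instead of 0.456435, which illustrates why the rank-then-Pearson route is the safer default.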
A C++ program that evaluates Spearman's coefficient for the second example above produces the following output:
Vector X
15 18 21 15 21
Rankings of X
1.5 3 4.5 1.5 4.5
Vector Y
25 25 27 27 27
Rankings of Y
1.5 1.5 4 4 4
Spearman's Rank correlation:
0.456435
Python code to calculate Spearman’s Rank Correlation:
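One possible version is sketched below. It assumes pandas is installed and relies on two documented defaults: `Series.rank()` averages the ranks of tied values, and `Series.corr()` computes the Pearson coefficient. The variable names are illustrative.

```python
import pandas as pd

# Data from the second example.
X = pd.Series([15, 18, 21, 15, 21])
Y = pd.Series([25, 25, 27, 27, 27])

# rank() uses method='average' by default, i.e. fractional ranking.
rank_x = X.rank()
rank_y = Y.rank()

# corr() uses method='pearson' by default, so this is Pearson on the ranks.
rho = rank_x.corr(rank_y)

print("Rankings of X:")
print(rank_x)
print("Rankings of Y:")
print(rank_y)
print("Spearman's Rank correlation:", rho)
```

The number of digits printed for the coefficient depends on the Python version in use.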
Output:
Rankings of X:
0    1.5
1    3.0
2    4.5
3    1.5
4    4.5
dtype: float64
Rankings of Y:
0    1.5
1    1.5
2    4.0
3    4.0
4    4.0
dtype: float64
Spearman's Rank correlation: 0.456435464588
Python code to calculate Spearman's correlation using SciPy
SciPy offers a simple way to get the Spearman correlation value directly, without ranking the data by hand.
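A short sketch, assuming SciPy is installed: `scipy.stats.spearmanr` takes the two samples and returns the correlation together with a p-value. The data are the second example from above; the variable names are illustrative.

```python
from scipy.stats import spearmanr

# Data from the second example.
X = [15, 18, 21, 15, 21]
Y = [25, 25, 27, 27, 27]

# spearmanr ranks the data (averaging ties) and correlates the ranks.
rho, p_value = spearmanr(X, Y)

print("Spearman's Rank correlation:", rho)
print("p-value:", p_value)
```

The returned coefficient should agree with the 0.456435 obtained from the manual rank-then-Pearson computation.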
Improved By : mkumarchaudhary06