# Program for Spearman’s Rank Correlation

• Difficulty Level : Medium
• Last Updated : 03 Aug, 2022

Prerequisite : Correlation Coefficient
Given two arrays X[] and Y[], find Spearman’s rank correlation. Instead of working with the data values themselves (as discussed in Correlation Coefficient), Spearman’s rank correlation works with the ranks of those values: the observations are first ranked, and the ranks are then used to compute the correlation. The algorithm is as follows:

1. Rank each observation in X and store it in Rank_X.
2. Rank each observation in Y and store it in Rank_Y.
3. Obtain the Pearson correlation coefficient for Rank_X and Rank_Y.

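The formula used to calculate Pearson’s correlation coefficient (r or rho) of sets X and Y is as follows:

$$r = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}}$$

Algorithm for calculating Pearson’s coefficient of sets X and Y: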

    function correlationCoefficient(X, Y)
        n = X.size
        sigma_x = sigma_y = sigma_xy = 0
        sigma_xsq = sigma_ysq = 0
        for i in 0...n-1
            sigma_x = sigma_x + X[i]
            sigma_y = sigma_y + Y[i]
            sigma_xy = sigma_xy + X[i] * Y[i]
            sigma_xsq = sigma_xsq + X[i] * X[i]
            sigma_ysq = sigma_ysq + Y[i] * Y[i]

        num = n * sigma_xy - sigma_x * sigma_y
        den = sqrt( [n*sigma_xsq - (sigma_x)^2] * [n*sigma_ysq - (sigma_y)^2] )
        return num/den

While assigning ranks, we may encounter ties, i.e. two or more observations having the same rank. To resolve ties, we use the fractional ranking scheme. In this scheme, if n observations share the same rank, each of them gets a fractional rank given by:

fractional_rank = (rank) + (n-1)/2

The next rank that gets assigned is rank + n, not rank + 1. For instance, if 3 items share the rank r, each gets the fractional rank given above, and the next rank that can be given to another observation is r + 3. Note that fractional ranks need not be fractions: they are the arithmetic mean of n consecutive ranks, e.g. r, r + 1, r + 2, …, r + n-1.

(r + r+1 + r+2 + ... + r+n-1) / n = r + (n-1)/2

Some Examples :

    Input :    X = [15 18 19 20 21]
               Y = [25 26 28 27 29]
    Solution : Rank_X = [1 2 3 4 5]
               Rank_Y = [1 2 4 3 5]
               sigma_x = 1+2+3+4+5 = 15
               sigma_y = 1+2+4+3+5 = 15
               sigma_xy = 1*1+2*2+3*4+4*3+5*5 = 54
               sigma_xsq = 1*1+2*2+3*3+4*4+5*5 = 55
               sigma_ysq = 1*1+2*2+4*4+3*3+5*5 = 55
               Substitute values in formula
               Coefficient = Pearson(Rank_X, Rank_Y) = 0.9
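
For reference, the substitution step can be written out as follows. This is only the arithmetic for the sums listed in this example, with n = 5, plugged into num and den from the Pearson pseudocode:

$$r = \frac{5 \cdot 54 - 15 \cdot 15}{\sqrt{\left(5 \cdot 55 - 15^2\right)\left(5 \cdot 55 - 15^2\right)}} = \frac{45}{\sqrt{50 \cdot 50}} = \frac{45}{50} = 0.9$$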

    Input :    X = [15 18 21 15 21]
               Y = [25 25 27 27 27]
    Solution : Rank_X = [1.5 3 4.5 1.5 4.5]
               Rank_Y = [1.5 1.5 4 4 4]
               Calculate and substitute values of sigma_x, sigma_y,
               sigma_xy, sigma_xsq, sigma_ysq.
               Coefficient = Pearson(Rank_X, Rank_Y) = 0.456435
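
The intermediate sums for this tied example are not listed above, so here is one way to work them out from the stated rank vectors (n = 5):

$$\sum x = 15, \quad \sum y = 15, \quad \sum xy = 2.25 + 4.5 + 18 + 6 + 18 = 48.75$$

$$\sum x^2 = 2.25 + 9 + 20.25 + 2.25 + 20.25 = 54, \quad \sum y^2 = 2.25 + 2.25 + 16 + 16 + 16 = 52.5$$

$$r = \frac{5 \cdot 48.75 - 15 \cdot 15}{\sqrt{\left(5 \cdot 54 - 15^2\right)\left(5 \cdot 52.5 - 15^2\right)}} = \frac{18.75}{\sqrt{45 \cdot 37.5}} \approx \frac{18.75}{41.0792} \approx 0.456435$$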

The Algorithm for fractional ranking scheme is given below:

    function rankify(X)
        N = X.size()

        // Vector to store ranks
        Rank_X(N)
        for i = 0 ... N-1
            r = 1 and s = 1

            // Count no of smaller (r) and equal (s) elements in 0...i-1
            for j = 0...i-1
                if X[j] < X[i]
                    r = r+1
                if X[j] == X[i]
                    s = s+1

            // Count no of smaller (r) and equal (s) elements in i+1...N-1
            for j = i+1...N-1
                if X[j] < X[i]
                    r = r+1
                if X[j] == X[i]
                    s = s+1

            // Assign Fractional Rank
            Rank_X[i] = r + (s-1) * 0.5

        return Rank_X
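
The pairwise counting above takes O(N²) comparisons. As an alternative that is not part of the original algorithm, the same fractional ranks can be obtained by sorting first and then assigning each run of tied values the mean of the ranks it occupies. The sketch below is a minimal illustration in Python; the name `rankify_sorted` is hypothetical and chosen only for this example.

```python
def rankify_sorted(values):
    """Fractional (average) ranks via sorting; tied values share the mean rank."""
    n = len(values)
    # Indices of the observations, ordered by their value.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Extend `end` over the run of observations tied with order[start].
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        # The run occupies 1-based ranks start+1 .. end+1; their mean is
        # r + (s - 1) / 2 with r = start + 1 and s = end - start + 1.
        shared_rank = (start + 1) + (end - start) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


print(rankify_sorted([15, 18, 21, 15, 21]))  # [1.5, 3.0, 4.5, 1.5, 4.5]
print(rankify_sorted([25, 25, 27, 27, 27]))  # [1.5, 1.5, 4.0, 4.0, 4.0]
```

The results match the rank vectors of the second example; the sort dominates the cost, so this variant runs in O(N log N).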

Note:
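There is a direct formula to calculate Spearman’s coefficient, given by

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

where d_i is the difference between the two ranks of the i-th observation. However, this formula is exact only when there are no ties; with ties, a correction term is needed to resolve each tie, and hence this formula has not been discussed. Calculating Spearman’s coefficient from the correlation coefficient of ranks is the most general method.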
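
To illustrate why the uncorrected direct formula is avoided, the sketch below compares it with Pearson-on-ranks for the two example inputs used in this article. The helper names (`rankify`, `pearson`, `spearman_shortcut`) are chosen for this sketch, and the shortcut is the textbook rank-difference form without any tie correction.

```python
def rankify(values):
    """Fractional ranks: r counts smaller elements, s counts equal ones."""
    ranks = []
    for v in values:
        r = 1 + sum(1 for u in values if u < v)
        s = sum(1 for u in values if u == v)  # includes v itself
        ranks.append(r + (s - 1) * 0.5)
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / ((n * sxx - sx * sx) * (n * syy - sy * sy)) ** 0.5


def spearman_shortcut(x, y):
    """Uncorrected shortcut: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    rx, ry = rankify(x), rankify(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))


examples = [
    ([15, 18, 19, 20, 21], [25, 26, 28, 27, 29]),  # no ties
    ([15, 18, 21, 15, 21], [25, 25, 27, 27, 27]),  # ties in both arrays
]
for x, y in examples:
    shortcut = spearman_shortcut(x, y)
    general = pearson(rankify(x), rankify(y))
    print(round(shortcut, 6), round(general, 6))
# prints:
# 0.9 0.9
# 0.55 0.456435
```

Without ties the two values agree; with ties the uncorrected shortcut gives 0.55 instead of 0.456435, which is why the article computes Pearson’s coefficient on the ranks.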

Programs to evaluate Spearman’s coefficient are given below:

## C++

    // Program to find Spearman's rank
    // correlation coefficient
    #include <cmath>
    #include <iostream>
    #include <vector>
    using namespace std;

    typedef vector<float> Vector;

    // Utility function to print a Vector
    void printVector(const Vector &X)
    {
        for (auto i : X)
            cout << i << " ";

        cout << endl;
    }

    // Function returns the rank vector
    // of the set of observations
    Vector rankify(const Vector &X)
    {
        int N = X.size();

        // Rank Vector
        Vector Rank_X(N);

        for (int i = 0; i < N; i++)
        {
            int r = 1, s = 1;

            // Count smaller (r) and equal (s)
            // elements in 0 to i-1
            for (int j = 0; j < i; j++) {
                if (X[j] < X[i]) r++;
                if (X[j] == X[i]) s++;
            }

            // Count smaller (r) and equal (s)
            // elements in i+1 to N-1
            for (int j = i + 1; j < N; j++) {
                if (X[j] < X[i]) r++;
                if (X[j] == X[i]) s++;
            }

            // Use Fractional Rank formula
            // fractional_rank = r + (s-1)/2
            Rank_X[i] = r + (s - 1) * 0.5;
        }

        // Return Rank Vector
        return Rank_X;
    }

    // Function that returns the
    // Pearson correlation coefficient.
    float correlationCoefficient(const Vector &X, const Vector &Y)
    {
        int n = X.size();
        float sum_X = 0, sum_Y = 0, sum_XY = 0;
        float squareSum_X = 0, squareSum_Y = 0;

        for (int i = 0; i < n; i++)
        {
            // sum of elements of array X.
            sum_X = sum_X + X[i];

            // sum of elements of array Y.
            sum_Y = sum_Y + Y[i];

            // sum of X[i] * Y[i].
            sum_XY = sum_XY + X[i] * Y[i];

            // sum of square of array elements.
            squareSum_X = squareSum_X + X[i] * X[i];
            squareSum_Y = squareSum_Y + Y[i] * Y[i];
        }

        // use formula for calculating
        // correlation coefficient.
        float corr = (float)(n * sum_XY - sum_X * sum_Y) /
                     sqrt((n * squareSum_X - sum_X * sum_X) *
                          (n * squareSum_Y - sum_Y * sum_Y));

        return corr;
    }

    // Driver function
    int main()
    {
        Vector X = {15, 18, 21, 15, 21};
        Vector Y = {25, 25, 27, 27, 27};

        // Get ranks of vector X
        Vector rank_x = rankify(X);

        // Get ranks of vector Y
        Vector rank_y = rankify(Y);

        cout << "Vector X" << endl;
        printVector(X);

        // Print rank vector of X
        cout << "Rankings of X" << endl;
        printVector(rank_x);

        // Print Vector Y
        cout << "Vector Y" << endl;
        printVector(Y);

        // Print rank vector of Y
        cout << "Rankings of Y" << endl;
        printVector(rank_y);

        // Print Spearman's coefficient
        cout << "Spearman's Rank correlation: " << endl;
        cout << correlationCoefficient(rank_x, rank_y) << endl;

        return 0;
    }

## Java

    // Java Program to find Spearman's rank
    // correlation coefficient
    import java.util.*;

    class GFG {

        // Utility function to print a Vector
        static void printVector(ArrayList<Double> X)
        {
            for (double i : X)
                System.out.print(i + " ");

            System.out.println();
        }

        // Function returns the rank vector
        // of the set of observations
        static ArrayList<Double> rankify(ArrayList<Double> X)
        {
            int N = X.size();

            // Rank Vector
            ArrayList<Double> Rank_X = new ArrayList<Double>();

            for (int i = 0; i < N; i++) {
                Rank_X.add(0d);
                int r = 1, s = 1;

                // Unbox once so that == below compares values,
                // not Double object references
                double xi = X.get(i);

                // Count smaller (r) and equal (s)
                // elements in 0 to i-1
                for (int j = 0; j < i; j++) {
                    if (X.get(j) < xi)
                        r++;
                    if (X.get(j) == xi)
                        s++;
                }

                // Count smaller (r) and equal (s)
                // elements in i+1 to N-1
                for (int j = i + 1; j < N; j++) {
                    if (X.get(j) < xi)
                        r++;
                    if (X.get(j) == xi)
                        s++;
                }

                // Use Fractional Rank formula
                // fractional_rank = r + (s-1)/2
                Rank_X.set(i, (r + (s - 1) * 0.5));
            }

            // Return Rank Vector
            return Rank_X;
        }

        // Function that returns the
        // Pearson correlation coefficient.
        static double correlationCoefficient(ArrayList<Double> X,
                                             ArrayList<Double> Y)
        {
            int n = X.size();
            double sum_X = 0, sum_Y = 0, sum_XY = 0;
            double squareSum_X = 0, squareSum_Y = 0;

            for (int i = 0; i < n; i++) {
                // sum of elements of array X.
                sum_X = sum_X + X.get(i);

                // sum of elements of array Y.
                sum_Y = sum_Y + Y.get(i);

                // sum of X[i] * Y[i].
                sum_XY = sum_XY + X.get(i) * Y.get(i);

                // sum of square of array elements.
                squareSum_X = squareSum_X + X.get(i) * X.get(i);
                squareSum_Y = squareSum_Y + Y.get(i) * Y.get(i);
            }

            // use formula for calculating
            // correlation coefficient.
            double corr
                = (n * sum_XY - sum_X * sum_Y)
                  / Math.sqrt(
                      (n * squareSum_X - sum_X * sum_X)
                      * (n * squareSum_Y - sum_Y * sum_Y));

            return corr;
        }

        // Driver function
        public static void main(String[] args)
        {
            ArrayList<Double> X = new ArrayList<Double>(
                Arrays.asList(15d, 18d, 21d, 15d, 21d));
            ArrayList<Double> Y = new ArrayList<Double>(
                Arrays.asList(25d, 25d, 27d, 27d, 27d));

            // Get ranks of vector X
            ArrayList<Double> rank_x = rankify(X);

            // Get ranks of vector Y
            ArrayList<Double> rank_y = rankify(Y);

            System.out.println("Vector X");
            printVector(X);

            // Print rank vector of X
            System.out.println("Rankings of X");
            printVector(rank_x);

            // Print Vector Y
            System.out.println("Vector Y");
            printVector(Y);

            // Print rank vector of Y
            System.out.println("Rankings of Y");
            printVector(rank_y);

            // Print Spearman's coefficient
            System.out.println("Spearman's Rank correlation: ");
            System.out.println(
                correlationCoefficient(rank_x, rank_y));
        }
    }

    // This code is contributed by phasing17

## Python3

    # Python3 Program to find Spearman's rank correlation coefficient


    # Utility function to print a Vector
    def printVector(X):
        print(*X)


    # Function returns the rank vector
    # of the set of observations
    def rankify(X):
        N = len(X)

        # Rank Vector
        Rank_X = [None for _ in range(N)]

        for i in range(N):
            r = 1
            s = 1

            # Count smaller (r) and equal (s)
            # elements in 0 to i-1
            for j in range(i):
                if (X[j] < X[i]):
                    r += 1
                if (X[j] == X[i]):
                    s += 1

            # Count smaller (r) and equal (s)
            # elements in i+1 to N-1
            for j in range(i+1, N):
                if (X[j] < X[i]):
                    r += 1
                if (X[j] == X[i]):
                    s += 1

            # Use Fractional Rank formula
            # fractional_rank = r + (s-1)/2
            Rank_X[i] = r + (s-1) * 0.5

        # Return Rank Vector
        return Rank_X


    # Function that returns the
    # Pearson correlation coefficient.
    def correlationCoefficient(X, Y):
        n = len(X)
        sum_X = 0
        sum_Y = 0
        sum_XY = 0
        squareSum_X = 0
        squareSum_Y = 0

        for i in range(n):
            # sum of elements of array X.
            sum_X = sum_X + X[i]

            # sum of elements of array Y.
            sum_Y = sum_Y + Y[i]

            # sum of X[i] * Y[i].
            sum_XY = sum_XY + X[i] * Y[i]

            # sum of square of array elements.
            squareSum_X = squareSum_X + X[i] * X[i]
            squareSum_Y = squareSum_Y + Y[i] * Y[i]

        # use formula for calculating
        # correlation coefficient.
        corr = (n * sum_XY - sum_X * sum_Y) / (
            (n * squareSum_X - sum_X * sum_X)
            * (n * squareSum_Y - sum_Y * sum_Y)) ** 0.5

        return corr


    # Driver function
    X = [15, 18, 21, 15, 21]
    Y = [25, 25, 27, 27, 27]

    # Get ranks of vector X
    rank_x = rankify(X)

    # Get ranks of vector Y
    rank_y = rankify(Y)

    print("Vector X")
    printVector(X)

    # Print rank vector of X
    print("Rankings of X")
    printVector(rank_x)

    # Print Vector Y
    print("Vector Y")
    printVector(Y)

    # Print rank vector of Y
    print("Rankings of Y")
    printVector(rank_y)

    # Print Spearman's coefficient
    print("Spearman's Rank correlation: ")
    print(correlationCoefficient(rank_x, rank_y))

    # This code is contributed by phasing17

## C#

    // Program to find Spearman's rank
    // correlation coefficient
    using System;
    using System.Collections.Generic;

    class GFG {
        // Utility function to print a Vector
        static void printVector(List<double> X)
        {
            foreach(var i in X) Console.Write(i + " ");

            Console.WriteLine();
        }

        // Function returns the rank vector
        // of the set of observations
        static List<double> rankify(List<double> X)
        {
            int N = X.Count;

            // Rank Vector
            List<double> Rank_X = new List<double>();

            for (int i = 0; i < N; i++) {
                Rank_X.Add(0);
                int r = 1, s = 1;

                // Count smaller (r) and equal (s)
                // elements in 0 to i-1
                for (int j = 0; j < i; j++) {
                    if (X[j] < X[i])
                        r++;
                    if (X[j] == X[i])
                        s++;
                }

                // Count smaller (r) and equal (s)
                // elements in i+1 to N-1
                for (int j = i + 1; j < N; j++) {
                    if (X[j] < X[i])
                        r++;
                    if (X[j] == X[i])
                        s++;
                }

                // Use Fractional Rank formula
                // fractional_rank = r + (s-1)/2
                Rank_X[i] = (r + (s - 1) * 0.5);
            }

            // Return Rank Vector
            return Rank_X;
        }

        // Function that returns the
        // Pearson correlation coefficient.
        static double correlationCoefficient(List<double> X,
                                             List<double> Y)
        {
            int n = X.Count;
            double sum_X = 0, sum_Y = 0, sum_XY = 0;
            double squareSum_X = 0, squareSum_Y = 0;

            for (int i = 0; i < n; i++) {
                // sum of elements of array X.
                sum_X = sum_X + X[i];

                // sum of elements of array Y.
                sum_Y = sum_Y + Y[i];

                // sum of X[i] * Y[i].
                sum_XY = sum_XY + X[i] * Y[i];

                // sum of square of array elements.
                squareSum_X = squareSum_X + X[i] * X[i];
                squareSum_Y = squareSum_Y + Y[i] * Y[i];
            }

            // use formula for calculating
            // correlation coefficient.
            double corr
                = (n * sum_XY - sum_X * sum_Y)
                  / Math.Sqrt(
                      (n * squareSum_X - sum_X * sum_X)
                      * (n * squareSum_Y - sum_Y * sum_Y));

            return corr;
        }

        // Driver function
        public static void Main(string[] args)
        {
            List<double> X = new List<double>(
                new double[] { 15, 18, 21, 15, 21 });
            List<double> Y = new List<double>(
                new double[] { 25, 25, 27, 27, 27 });

            // Get ranks of vector X
            List<double> rank_x = rankify(X);

            // Get ranks of vector Y
            List<double> rank_y = rankify(Y);

            Console.WriteLine("Vector X");
            printVector(X);

            // Print rank vector of X
            Console.WriteLine("Rankings of X");
            printVector(rank_x);

            // Print Vector Y
            Console.WriteLine("Vector Y");
            printVector(Y);

            // Print rank vector of Y
            Console.WriteLine("Rankings of Y");
            printVector(rank_y);

            // Print Spearman's coefficient
            Console.WriteLine("Spearman's Rank correlation: ");
            Console.WriteLine(
                correlationCoefficient(rank_x, rank_y));
        }
    }

    // This code is contributed by phasing17

## Javascript

    // Program to find Spearman's rank
    // correlation coefficient

    // Utility function to print a Vector
    function printVector(X)
    {
        for (var i of X)
            process.stdout.write(i + " ");

        process.stdout.write("\n");
    }

    // Function returns the rank vector
    // of the set of observations
    function rankify(X)
    {
        let N = X.length;

        // Rank Vector
        let Rank_X = new Array(N);

        for (var i = 0; i < N; i++)
        {
            var r = 1, s = 1;

            // Count smaller (r) and equal (s)
            // elements in 0 to i-1
            for (var j = 0; j < i; j++) {
                if (X[j] < X[i]) r++;
                if (X[j] == X[i]) s++;
            }

            // Count smaller (r) and equal (s)
            // elements in i+1 to N-1
            for (var j = i + 1; j < N; j++) {
                if (X[j] < X[i]) r++;
                if (X[j] == X[i]) s++;
            }

            // Use Fractional Rank formula
            // fractional_rank = r + (s-1)/2
            Rank_X[i] = r + (s - 1) * 0.5;
        }

        // Return Rank Vector
        return Rank_X;
    }

    // Function that returns the
    // Pearson correlation coefficient.
    function correlationCoefficient(X, Y)
    {
        let n = X.length;
        let sum_X = 0, sum_Y = 0, sum_XY = 0;
        let squareSum_X = 0, squareSum_Y = 0;

        for (var i = 0; i < n; i++)
        {
            // sum of elements of array X.
            sum_X = sum_X + X[i];

            // sum of elements of array Y.
            sum_Y = sum_Y + Y[i];

            // sum of X[i] * Y[i].
            sum_XY = sum_XY + X[i] * Y[i];

            // sum of square of array elements.
            squareSum_X = squareSum_X + X[i] * X[i];
            squareSum_Y = squareSum_Y + Y[i] * Y[i];
        }

        // use formula for calculating
        // correlation coefficient.
        let corr = (n * sum_XY - sum_X * sum_Y) /
                   Math.sqrt((n * squareSum_X - sum_X * sum_X) *
                             (n * squareSum_Y - sum_Y * sum_Y));

        return corr;
    }

    // Driver function
    let X = [15, 18, 21, 15, 21];
    let Y = [25, 25, 27, 27, 27];

    // Get ranks of vector X
    let rank_x = rankify(X);

    // Get ranks of vector Y
    let rank_y = rankify(Y);

    console.log("Vector X");
    printVector(X);

    // Print rank vector of X
    console.log("Rankings of X");
    printVector(rank_x);

    // Print Vector Y
    console.log("Vector Y");
    printVector(Y);

    // Print rank vector of Y
    console.log("Rankings of Y");
    printVector(rank_y);

    // Print Spearman's coefficient
    console.log("Spearman's Rank correlation: ");
    console.log(correlationCoefficient(rank_x, rank_y));

    // This code is contributed by phasing17

Output:

Vector X
15   18   21   15   21
Rankings of X
1.5   3   4.5   1.5   4.5
Vector Y
25   25   27   27   27
Rankings of Y
1.5   1.5   4   4   4
Spearman's Rank correlation:
0.456435

Python code to calculate Spearman’s Rank Correlation:

## Python3

    # Import pandas and scipy.stats
    import pandas as pd
    import scipy.stats

    # Two lists x and y
    x = [15, 18, 21, 15, 21]
    y = [25, 25, 27, 27, 27]


    # Create a function that takes in x's and y's
    def spearmans_rank_correlation(x, y):

        # Calculate the rank of x's (ties get the average rank by default)
        xranks = pd.Series(x).rank()
        print("Rankings of X:")
        print(xranks)

        # Calculate the ranking of the y's
        yranks = pd.Series(y).rank()
        print("Rankings of Y:")
        print(yranks)

        # Calculate Pearson's correlation coefficient on the ranked versions
        # of the data. pearsonr returns (coefficient, p-value); index 0 keeps
        # only the coefficient.
        print("Spearman's Rank correlation:",
              scipy.stats.pearsonr(xranks, yranks)[0])


    # Call the function
    spearmans_rank_correlation(x, y)

    # This code is contributed by Manish KC
    # profile: mkumarchaudhary06

Output:

Rankings of X:
0    1.5
1    3.0
2    4.5
3    1.5
4    4.5
dtype: float64
Rankings of Y:
0    1.5
1    1.5
2    4.0
3    4.0
4    4.0
dtype: float64
Spearman's Rank correlation: 0.456435464588

Python code to calculate Spearman’s correlation using SciPy:
SciPy provides a direct way to get Spearman’s correlation: scipy.stats.spearmanr, which returns the coefficient together with a p-value.

## Python3

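    # Import scipy.stats
    import scipy.stats

    # Two lists x and y
    x = [15, 18, 21, 15, 21]
    y = [25, 25, 27, 27, 27]

    # spearmanr returns the coefficient and a p-value;
    # unpack them and print only the coefficient
    rho, p_value = scipy.stats.spearmanr(x, y)
    print(rho)

    # This code is contributed by Manish KC
    # Profile: mkumarchaudhary06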

Output:

0.45643546458763845
