
Linear Correlation Coefficient Formula


Correlation coefficients measure how strong the relationship between two variables is. There are several formulas for a correlation coefficient; one of the most popular is Pearson's correlation (also known as Pearson's r), which is commonly used in linear regression. Pearson's correlation coefficient is denoted by the symbol "r" (written as R in the formulas in this article). The correlation coefficient formula returns a value between -1 and +1. Here,

  • +1 indicates a perfect positive linear relationship
  • -1 indicates a perfect negative linear relationship
  • A result of zero indicates no linear relationship at all

Linear Correlation Coefficient Formula

The linear correlation coefficient, known as Pearson's r or Pearson's correlation coefficient, reflects the direction and strength of the linear relationship between two variables x and y. It returns a value between -1 and +1: -1 indicates a perfect negative correlation and +1 indicates a perfect positive correlation. If r is 0, there is no linear correlation; this is also known as zero correlation.

The "crude estimates" for interpreting the strength of a correlation using Pearson's correlation:

r value           Crude estimate
+.70 or higher    Very strong positive relationship
+.40 to +.69      Strong positive relationship
+.30 to +.39      Moderate positive relationship
+.20 to +.29      Weak positive relationship
+.01 to +.19      No or negligible relationship
0                 No relationship [zero correlation]
-.01 to -.19      No or negligible relationship
-.20 to -.29      Weak negative relationship
-.30 to -.39      Moderate negative relationship
-.40 to -.69      Strong negative relationship
-.70 or lower     Very strong negative relationship
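The bands in the table can be turned into a small lookup. The sketch below is one possible reading, not a standard rule: describe_strength is a hypothetical helper name, the negative bands are taken as mirror images of the positive ones, and values that fall between the listed bands (for example .195) are assigned to the lower band, which is an assumption because the table does not cover them.

```python
def describe_strength(r: float) -> str:
    """Map a Pearson r to a crude-estimate label (hypothetical helper)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no relationship"
    size = abs(r)
    direction = "positive" if r > 0 else "negative"
    # Assumption: gaps between the printed bands fall into the lower band.
    if size >= 0.70:
        return f"very strong {direction} relationship"
    if size >= 0.40:
        return f"strong {direction} relationship"
    if size >= 0.30:
        return f"moderate {direction} relationship"
    if size >= 0.20:
        return f"weak {direction} relationship"
    return "no or negligible relationship"


for r in (0.69, 0.42, -0.23, -0.99):
    print(r, "->", describe_strength(r))
```

These labels are rules of thumb; what counts as a "strong" correlation varies between fields.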

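The formula used to get the linear correlation coefficient of the data is:

R = [n(∑xy) - (∑x)(∑y)] / √{[n∑x² - (∑x)²][n∑y² - (∑y)²]}

Here n is the number of (x, y) pairs, and the whole numerator is divided by the square root of the product of the two bracketed terms.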
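The formula can be sketched in a few lines of Python. This is a minimal illustration rather than a library implementation: pearson_r is a name chosen here, and the hours/score values are made-up sample data. The point to notice is the grouping: the entire numerator is divided by the square root of the product of both bracketed terms.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r computed from the raw sums used in the formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # Happens when every x (or every y) is the same value.
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


# Made-up sample data: score tends to rise with hours, but not perfectly.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
print(round(pearson_r(hours, score), 4))  # 0.7746
```

For this sample the sums are ∑x = 15, ∑y = 20, ∑xy = 66, ∑x² = 55 and ∑y² = 86, so R = 30 / √(50 × 30) ≈ 0.7746.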

What are the types of linear correlation?

Linear correlation is measured by Pearson's r, whose value can range between -1 and +1.

There are three types of linear correlation, as follows:

Positive values indicate a Positive Correlation (0 < r ≤ 1)

Negative values indicate a Negative Correlation (-1 ≤ r < 0)

A Value of 0 indicates No Correlation (r = 0)

Positive correlation: In a positive correlation, both variables move in the same direction. If one increases, the other also increases, and if one decreases, the other also decreases. Whenever r is positive, it shows a positive relationship.

Negative correlation: In a negative correlation, the variables move in opposite directions. If one increases, the other decreases, and if one decreases, the other increases. Whenever r is negative, it shows a negative relationship.

No correlation: When there is no linear association between the variables, they are said to have no correlation. In this case, their correlation coefficient r is 0.
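The three cases can be illustrated with small made-up data sets. The sketch below uses the mean-deviation form of Pearson's r, which is algebraically equivalent to the sums formula; the function name and all data values are assumptions chosen for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r in mean-deviation form (equivalent to the sums formula)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]    # tends to increase as x increases
falling = [6, 4, 5, 4, 2]   # tends to decrease as x increases
flat = [4, 1, 0, 1, 4]      # symmetric U shape: no linear trend

for name, y in (("rising", rising), ("falling", falling), ("flat", flat)):
    r = pearson_r(x, y)
    kind = "positive" if r > 0 else "negative" if r < 0 else "no"
    print(f"{name}: r = {r:.4f} -> {kind} correlation")
```

The last data set follows a clear U-shaped pattern yet gives r = 0, which shows that r only captures linear association.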

Sample Problems

Problem 1: Calculate the correlation coefficient for the following data:

X = 12, 16, 4, 8

and

Y = 15, 20, 5, 10

Solution:

Given variables are,

X = 12, 16, 4, 8

and

Y = 15, 20, 5, 10

To find the correlation coefficient, first construct a table with columns for XY, X², and Y², then total each column to get the sums required by the formula.

X      Y      XY      X²      Y²
12     15     180     144     225
16     20     320     256     400
4      5      20      16      25
8      10     80      64      100
∑40    ∑50    ∑600    ∑480    ∑750

∑xy = 600

∑x = 40

∑y = 50

∑x² = 480

∑y² = 750

n = 4

Put all the values in the Pearson's correlation coefficient formula:

R = [n(∑xy) - (∑x)(∑y)] / √{[n∑x² - (∑x)²][n∑y² - (∑y)²]}

R = [4(600) - (40)(50)] / √{[4(480) - (40)²][4(750) - (50)²]}

R = 400 / √{[320][500]}

R = 400 / 400

R = 1

It shows that the relationship between the variables of the data is a perfect positive relationship: every pair satisfies Y = 1.25X exactly.

Problem 2: Find the value of the correlation coefficient from the following table:

SUBJECT    AGE X    GLUCOSE LEVEL Y
1          42       98
2          23       68
3          22       73
4          47       79
5          50       88
6          60       82

Solution:

Make a table from the given data and add three more columns for XY, X², and Y². Then total each column to get ∑x, ∑y, ∑xy, ∑x², and ∑y², with n = 6.

SUBJECT    AGE X    GLUCOSE LEVEL Y    XY       X²       Y²
1          42       98                 4116     1764     9604
2          23       68                 1564     529      4624
3          22       73                 1606     484      5329
4          47       79                 3713     2209     6241
5          50       88                 4400     2500     7744
6          60       82                 4920     3600     6724
∑          244      488                20319    11086    40266

∑xy = 20319

∑x = 244

∑y = 488

∑x² = 11086

∑y² = 40266

n = 6

Put all the values in the Pearson's correlation coefficient formula:

R = [n(∑xy) - (∑x)(∑y)] / √{[n∑x² - (∑x)²][n∑y² - (∑y)²]}

R = [6(20319) - (244)(488)] / √{[6(11086) - (244)²][6(40266) - (488)²]}

R = 2842 / √{[6980][3452]}

R = 2842 / 4908.662

R = 0.5790

It shows that the relationship between the variables of the data is a strong positive relationship.

Problem 3: Calculate the correlation coefficient for the following data:

X = 21,31,25,40,47,38

and

Y = 70,55,60,78,66,80

Solution:

Given variables are,

X = 21,31,25,40,47,38

and

Y = 70,55,60,78,66,80

To find the correlation coefficient, first construct a table with columns for XY, X², and Y², then total each column to get the sums required by the formula.

X       Y       XY        X²       Y²
21      70      1470      441      4900
31      55      1705      961      3025
25      60      1500      625      3600
40      78      3120      1600     6084
47      66      3102      2209     4356
38      80      3040      1444     6400
∑202    ∑409    ∑13937    ∑7280    ∑28365

∑xy = 13937

∑x = 202

∑y = 409

∑x² = 7280

∑y² = 28365

n = 6

Put all the values in the Pearson's correlation coefficient formula:

R = [n(∑xy) - (∑x)(∑y)] / √{[n∑x² - (∑x)²][n∑y² - (∑y)²]}

R = [6(13937) - (202)(409)] / √{[6(7280) - (202)²][6(28365) - (409)²]}

R = 1004 / √{[2876][2909]}

R = 1004 / 2892.453

R = 0.3471

It shows that the relationship between the variables of the data is a moderate positive relationship.

Problem 4: Calculate the correlation coefficient for the following data:

X= 12, 10, 42, 27,35,56

and

Y = 13, 15, 56, 34,65,26

Solution:

Given variables are,

X= 12, 10, 42, 27,35,56

and

Y = 13, 15, 56, 34,65,26

To find the correlation coefficient, first construct a table with columns for XY, X², and Y², then total each column to get the sums required by the formula.

X       Y       XY       X²       Y²
12      13      156      144      169
10      15      150      100      225
42      56      2352     1764     3136
27      34      918      729      1156
35      65      2275     1225     4225
56      26      1456     3136     676
∑182    ∑209    ∑7307    ∑7098    ∑9587

∑xy = 7307

∑x = 182

∑y = 209

∑x² = 7098

∑y² = 9587

n = 6

Put all the values in the Pearson's correlation coefficient formula:

R = [n(∑xy) - (∑x)(∑y)] / √{[n∑x² - (∑x)²][n∑y² - (∑y)²]}

R = [6(7307) - (182)(209)] / √{[6(7098) - (182)²][6(9587) - (209)²]}

R = 5804 / √{[9464][13841]}

R = 5804 / 11445.139

R = 0.5071

It shows that the relationship between the variables of the data is a strong positive relationship.
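The working table for Problem 4 can also be rebuilt programmatically, which is a convenient way to double-check hand-computed products and column totals. This is only a sketch; the variable names and the print layout are choices made here, not part of the method.

```python
xs = [12, 10, 42, 27, 35, 56]
ys = [13, 15, 56, 34, 65, 26]

# One row per observation: X, Y, XY, X², Y².
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(xs, ys)]
# Column totals give ∑x, ∑y, ∑xy, ∑x², ∑y² in that order.
totals = tuple(sum(col) for col in zip(*rows))

header = ("X", "Y", "XY", "X^2", "Y^2")
print("".join(f"{h:>8}" for h in header))
for row in rows:
    print("".join(f"{v:>8}" for v in row))
print("".join(f"{t:>8}" for t in totals))

n = len(xs)
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = totals
numerator = n * sum_xy - sum_x * sum_y
denominator = ((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)) ** 0.5
print("r =", round(numerator / denominator, 4))  # r = 0.5071
```

The totals row comes out as 182, 209, 7307, 7098 and 9587, matching the sums used in the solution.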

Problem 5: For each of the following correlation coefficients, state whether the relationship between the variables is positive or negative, and how strong it is.

0.69

0.42

-0.23

-0.99

Solution:

The given correlation coefficients are as follows:

0.69

0.42

-0.23

-0.99

For each one, state whether the relationship is negative or positive:

0.69

The relationship between the variables is a strong positive relationship.

0.42

The relationship between the variables is a strong positive relationship.

-0.23

The relationship between the variables is a weak negative relationship.

-0.99

The relationship between the variables is a very strong negative relationship.

Problem 6: Calculate the correlation coefficient for the following data:

X = 10, 13, 15 ,17 ,19

and

Y = 5,10,15,20,25.

Solution:

Given variables are,

X = 10, 13, 15 ,17 ,19

and

Y = 5,10,15,20,25.

To find the correlation coefficient, first construct a table with columns for XY, X², and Y², then total each column to get the sums required by the formula.

X      Y      XY       X²       Y²
10     5      50       100      25
13     10     130      169      100
15     15     225      225      225
17     20     340      289      400
19     25     475      361      625
∑74    ∑75    ∑1220    ∑1144    ∑1375

∑xy = 1220

∑x = 74

∑y = 75

∑x² = 1144

∑y² = 1375

n = 5

Put all the values in the Pearson's correlation coefficient formula:

R = [n(∑xy) - (∑x)(∑y)] / √{[n∑x² - (∑x)²][n∑y² - (∑y)²]}

R = [5(1220) - (74)(75)] / √{[5(1144) - (74)²][5(1375) - (75)²]}

R = 550 / √{[244][1250]}

R = 550 / 552.268

R = 0.9959

It shows that the relationship between the variables of the data is a very strong positive relationship.

Problem 7: Find the value of the correlation coefficient from the following table:

SUBJECT    AGE X    WEIGHT Y
1          40       99
2          25       79
3          22       69
4          54       89

Solution:

Make a table from the given data and add three more columns for XY, X², and Y². Then total each column to get ∑x, ∑y, ∑xy, ∑x², and ∑y², with n = 4.

SUBJECT    AGE X    WEIGHT Y    XY       X²      Y²
1          40       99          3960     1600    9801
2          25       79          1975     625     6241
3          22       69          1518     484     4761
4          54       89          4806     2916    7921
∑          141      336         12259    5625    28724

∑xy = 12259

∑x = 141

∑y = 336

∑x² = 5625

∑y² = 28724

n = 4

Put all the values in the Pearson's correlation coefficient formula:

R = [n(∑xy) - (∑x)(∑y)] / √{[n∑x² - (∑x)²][n∑y² - (∑y)²]}

R = [4(12259) - (141)(336)] / √{[4(5625) - (141)²][4(28724) - (336)²]}

R = 1660 / √{[2619][2000]}

R = 1660 / 2288.668

R = 0.7253

It shows that the relationship between the variables of the data is a very strong positive relationship.
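Two quick sanity checks help catch arithmetic slips in calculations like this one. Each bracketed term under the square root equals n² times a variance, so it can never be negative, and the final value of r must always lie between -1 and +1. The sketch below applies both checks to the age and weight data from the Problem 7 table; the variable names are chosen here for illustration.

```python
from math import sqrt

ages = [40, 25, 22, 54]
weights = [99, 79, 69, 89]

n = len(ages)
sum_x, sum_y = sum(ages), sum(weights)
sum_xy = sum(x * y for x, y in zip(ages, weights))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in weights)

# Each bracket is n^2 times a variance, so a negative value signals an
# arithmetic slip somewhere in the table or the totals.
bracket_x = n * sum_x2 - sum_x ** 2
bracket_y = n * sum_y2 - sum_y ** 2
assert bracket_x >= 0 and bracket_y >= 0, "recheck the column totals"

r = (n * sum_xy - sum_x * sum_y) / sqrt(bracket_x * bracket_y)
assert -1.0 <= r <= 1.0, "r outside [-1, 1]: recheck the arithmetic"

print("sums:", sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print("brackets:", bracket_x, bracket_y)
print("r =", round(r, 4))
```

If either assertion fails on hand-computed totals, the usual culprit is a mis-added column or a mistyped product.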


Last Updated : 16 Feb, 2022