Pearson Correlation Coefficient
Correlation coefficients measure how strong the relationship between two variables is. There are several formulas for a correlation coefficient; one of the most popular is Pearson’s correlation (also known as Pearson’s R), which is commonly used alongside linear regression. Pearson’s correlation coefficient is denoted by the symbol “R”. The formula returns a value between -1 and 1. Here,
- -1 indicates a perfect negative linear relationship
- 1 indicates a perfect positive linear relationship
- A result of zero indicates no linear relationship at all
Pearson’s Correlation Coefficient Formula
Pearson’s formula is the most commonly used way to calculate a correlation coefficient. The coefficient is denoted by a capital “R”. The formula is shown below:
R = [n(∑xy) – (∑x)(∑y)] / √([n∑x² – (∑x)²][n∑y² – (∑y)²])
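As a minimal sketch, the formula can be written as a small Python function. The name `pearson_r` and the sample data are illustrative assumptions, not part of the original article.

```python
import math

def pearson_r(xs, ys):
    """Pearson's R from the raw-sum formula (hypothetical helper name)."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of values")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    # The denominator is zero when either variable is constant; R is undefined then.
    return numerator / denominator

# Made-up data: y tends to rise with x, but not perfectly.
print(round(pearson_r([1, 2, 3, 4], [2, 4, 5, 9]), 4))
```

The numerator is placed over the whole square root, which is the grouping the brackets in the formula are meant to express.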
The full name of Pearson’s correlation coefficient is the Pearson Product-Moment Correlation (PPMC). It describes the linear relationship between two sets of data.
Pearson’s correlation measures both the strength of a linear relationship between two variables (given by the coefficient r, a value between -1 and +1) and the evidence that the relationship exists (given by the p-value). If the p-value is significant, we conclude that a correlation exists.
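The text does not say how the p-value is obtained. One option, used here purely as an assumption, is an exact permutation test: re-pair the y values with the x values in every possible order and count how often the coefficient is at least as extreme as the observed one. Statistical packages usually rely on a t-distribution-based test instead, so their p-values will differ. The function names and the data are illustrative.

```python
import math
from itertools import permutations

def pearson_r(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

def exact_permutation_p_value(xs, ys):
    """Two-sided p-value: share of all re-pairings whose |r| reaches the observed |r|."""
    observed = abs(pearson_r(xs, ys))
    count = 0
    total = 0
    for shuffled in permutations(ys):
        total += 1
        # Small tolerance so that ties are not lost to floating-point rounding.
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            count += 1
    return count / total

# Enumerating every ordering is only practical for small samples (n! re-pairings).
x = [10, 13, 15, 17, 19]
y = [5, 10, 15, 20, 25]
r = pearson_r(x, y)
p = exact_permutation_p_value(x, y)
print(f"r = {r:.4f}, p = {p:.4f}")
```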
Cohen (1988) says that an absolute value of r of 0.5 is classified as large, an absolute value of 0.3 is classified as medium and an absolute value of 0.1 is classified as small.
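A small sketch of these thresholds in Python. Treating each quoted value as the lower bound of its band, and the label used below 0.1, are assumptions made here for illustration.

```python
def cohen_label(r):
    """Label |r| using the 0.1 / 0.3 / 0.5 values quoted from Cohen (1988)."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "below small"  # label chosen here; the text does not name this range

for r in (0.69, 0.42, -0.23, -0.99, 0.05):
    print(r, cohen_label(r))
```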
Pearson’s correlation coefficient is interpreted as follows:
- A correlation coefficient of 1 means that for every increase in one variable, the other increases by a fixed proportion. For example, shoe size goes up in (almost) perfect correlation with foot length.
- A correlation coefficient of 0 indicates that there is no linear relationship between the variables.
- A correlation coefficient of -1 means that for every increase in one variable, the other decreases by a fixed proportion. For example, the amount of water in a tank decreases in perfect correlation with the amount of water drawn from its tap.
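The two extreme cases can be illustrated with made-up data that follow an exact straight line. The numbers and the linear rules below are hypothetical and chosen only so that the relationship is perfectly linear.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

# Hypothetical rule: y rises by 3 for every 2 units of x.
foot_length = [22, 24, 26, 28]
shoe_size = [31, 34, 37, 40]

# Hypothetical rule: the tank loses 5 litres per minute of flow.
minutes_of_flow = [0, 1, 2, 3]
water_left = [100, 95, 90, 85]

r_up = pearson_r(foot_length, shoe_size)
r_down = pearson_r(minutes_of_flow, water_left)
print(round(r_up, 6), round(r_down, 6))
```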
Steps to find the correlation coefficient with Pearson’s correlation coefficient formula:
Step 1: Make a table from the given data with columns for subject, x, and y, and add three more columns: xy, x² and y².
Step 2: Multiply the x and y values to fill the xy column. For example, if x is 24 and y is 65, then xy is 24 × 65 = 1560.
Step 3: Square the numbers in the x column to fill the x² column.
Step 4: Square the numbers in the y column to fill the y² column.
Step 5: Add up all the values in each column and put the result at the bottom. The Greek letter sigma (Σ) is shorthand for summation.
Step 6: Use the formula for Pearson’s correlation coefficient:
R = [n(∑xy) – (∑x)(∑y)] / √([n∑x² – (∑x)²][n∑y² – (∑y)²])
The sign of R then tells us whether the relationship between the variables is positive or negative.
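The six steps can be sketched in Python as follows. The function names and the data are illustrative assumptions; only the pair x = 24, y = 65 comes from the example in Step 2.

```python
import math

def correlation_table(xs, ys):
    # Steps 1-4: one row per observation holding x, y, xy, x² and y².
    rows = [(x, y, x * y, x * x, y * y) for x, y in zip(xs, ys)]
    # Step 5: total of each column, in the same order as the row fields.
    totals = tuple(sum(column) for column in zip(*rows))
    return rows, totals

def r_from_totals(n, totals):
    # Step 6: plug the column totals into the formula.
    sum_x, sum_y, sum_xy, sum_x2, sum_y2 = totals
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

x = [24, 30, 18]
y = [65, 72, 60]
rows, totals = correlation_table(x, y)
for row in rows:
    print(row)
print("totals:", totals)
r = r_from_totals(len(x), totals)
print("R =", round(r, 4))
```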
Sample Problems
Problem 1: Given the following correlation coefficients, state whether each relationship is positive or negative.
0.69, 0.42, -0.23, -0.99
Solution:
The given correlation coefficients are as follows:
0.69, 0.42, -0.23, -0.99
For each one, state whether the relationship is negative or positive:
0.69: The relationship between the variables is a strong positive relationship
0.42: The relationship between the variables is a moderate positive relationship
-0.23: The relationship between the variables is a weak negative relationship
-0.99: The relationship between the variables is a very strong negative relationship
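The direction of each relationship depends only on the sign of the coefficient, which a short sketch can make explicit. The function name `direction` is a hypothetical choice.

```python
def direction(r):
    """Direction of a linear relationship, read from the sign of the coefficient."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "no linear relationship"

for r in (0.69, 0.42, -0.23, -0.99):
    print(r, direction(r))
```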
Problem 2: Calculate the correlation coefficient for the following data using Pearson’s correlation coefficient formula:
X = 10, 13, 15, 17, 19
and
Y = 5, 10, 15, 20, 25.
Solution:
The given variables are,
X = 10, 13, 15, 17, 19
and
Y = 5, 10, 15, 20, 25.
To find the correlation coefficient, first construct a table as follows to get the values required by the formula, then add up each column to get the sums used in the formula.
X | Y | XY | X² | Y² |
10 | 5 | 50 | 100 | 25 |
13 | 10 | 130 | 169 | 100 |
15 | 15 | 225 | 225 | 225 |
17 | 20 | 340 | 289 | 400 |
19 | 25 | 475 | 361 | 625 |
∑ 74 | ∑ 75 | ∑ 1220 | ∑ 1144 | ∑ 1375 |
∑xy = 1220
∑x = 74
∑y = 75
∑x² = 1144
∑y² = 1375
n = 5
Put all the values into Pearson’s correlation coefficient formula:
R = [n(∑xy) – (∑x)(∑y)] / √([n∑x² – (∑x)²][n∑y² – (∑y)²])
R = [5(1220) – (74)(75)] / √([5(1144) – (74)²][5(1375) – (75)²])
R = 550 / √([244][1250])
R = 550 / 552.268
R = 0.9959
The correlation coefficient is 0.9959
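The arithmetic for Problem 2 can be reproduced step by step with a short Python sketch that prints each intermediate quantity. The variable names are illustrative.

```python
import math

x = [10, 13, 15, 17, 19]
y = [5, 10, 15, 20, 25]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

numerator = n * sum_xy - sum_x * sum_y   # n(∑xy) – (∑x)(∑y)
x_term = n * sum_x2 - sum_x ** 2         # n∑x² – (∑x)²
y_term = n * sum_y2 - sum_y ** 2         # n∑y² – (∑y)²
r = numerator / math.sqrt(x_term * y_term)

print("sums:", sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print("numerator:", numerator, "bracket terms:", x_term, y_term)
print("R =", round(r, 4))
```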
Problem 3: Calculate the correlation coefficient for the following table with the help of Pearson’s correlation coefficient formula:
SUBJECT | AGE X | Weight Y |
1 | 40 | 99 |
2 | 25 | 79 |
3 | 22 | 69 |
4 | 54 | 89 |
Solution:
Make a table from the given data and add three more columns: XY, X², and Y². Then add up the values in each column to get ∑xy, ∑x, ∑y, ∑x², and ∑y², with n = 4.
SUBJECT | AGE X | Weight Y | XY | X² | Y² |
1 | 40 | 99 | 3960 | 1600 | 9801 |
2 | 25 | 79 | 1975 | 625 | 6241 |
3 | 22 | 69 | 1518 | 484 | 4761 |
4 | 54 | 89 | 4806 | 2916 | 7921 |
∑ | 141 | 336 | 12259 | 5625 | 28724 |
∑xy = 12259
∑x = 141
∑y = 336
∑x² = 5625
∑y² = 28724
n = 4
Put all the values into Pearson’s correlation coefficient formula:
R = [n(∑xy) – (∑x)(∑y)] / √([n∑x² – (∑x)²][n∑y² – (∑y)²])
R = [4(12259) – (141)(336)] / √([4(5625) – (141)²][4(28724) – (336)²])
R = 1660 / √([2619][2000])
R = 1660 / 2288.668
R = 0.7253
The correlation coefficient is 0.7253
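As an independent cross-check for Problem 3, the coefficient can also be computed from deviations about the means, a form that is algebraically equivalent to the raw-sum formula. This is a sketch with illustrative variable names, not the method used in the article.

```python
import math

age = [40, 25, 22, 54]
weight = [99, 79, 69, 89]

mean_x = sum(age) / len(age)
mean_y = sum(weight) / len(weight)

# Deviation of every value from its column mean.
dx = [a - mean_x for a in age]
dy = [w - mean_y for w in weight]

# R = ∑(dx·dy) / √(∑dx² · ∑dy²)
r = sum(a * b for a, b in zip(dx, dy)) / math.sqrt(
    sum(a * a for a in dx) * sum(b * b for b in dy)
)
print("∑x =", sum(age), "∑y =", sum(weight))
print("R =", round(r, 4))
```

Because a valid coefficient always lies between -1 and 1, any result outside that range signals an arithmetic slip somewhere in the table or the sums.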
Problem 4: Calculate the correlation coefficient for the following data using Pearson’s correlation coefficient formula:
X = 5, 9, 14, 16
and
Y = 6, 10, 16, 20.
Solution:
The given variables are,
X = 5, 9, 14, 16
and
Y = 6, 10, 16, 20.
To find the correlation coefficient, first construct a table as follows to get the values required by the formula, then add up each column to get the sums used in the formula.
X | Y | XY | X² | Y² |
5 | 6 | 30 | 25 | 36 |
9 | 10 | 90 | 81 | 100 |
14 | 16 | 224 | 196 | 256 |
16 | 20 | 320 | 256 | 400 |
∑ 44 | ∑ 52 | ∑ 664 | ∑ 558 | ∑ 792 |
∑xy = 664
∑x = 44
∑y = 52
∑x² = 558
∑y² = 792
n = 4
Put all the values into Pearson’s correlation coefficient formula:
R = [n(∑xy) – (∑x)(∑y)] / √([n∑x² – (∑x)²][n∑y² – (∑y)²])
R = [4(664) – (44)(52)] / √([4(558) – (44)²][4(792) – (52)²])
R = 368 / √([296][464])
R = 368 / 370.5995
R = 0.9930
The correlation coefficient is 0.993
Problem 5: Calculate the correlation coefficient for the following data using Pearson’s correlation coefficient formula:
X = 21, 31, 25, 40, 47, 38
and
Y = 70, 55, 60, 78, 66, 80
Solution:
The given variables are,
X = 21, 31, 25, 40, 47, 38
and
Y = 70, 55, 60, 78, 66, 80
To find the correlation coefficient, first construct a table as follows to get the values required by the formula, then add up each column to get the sums used in the formula.
X | Y | XY | X² | Y² |
21 | 70 | 1470 | 441 | 4900 |
31 | 55 | 1705 | 961 | 3025 |
25 | 60 | 1500 | 625 | 3600 |
40 | 78 | 3120 | 1600 | 6084 |
47 | 66 | 3102 | 2209 | 4356 |
38 | 80 | 3040 | 1444 | 6400 |
∑ 202 | ∑ 409 | ∑ 13937 | ∑ 7280 | ∑ 28365 |
∑xy = 13937
∑x = 202
∑y = 409
∑x² = 7280
∑y² = 28365
n = 6
Put all the values into Pearson’s correlation coefficient formula:
R = [n(∑xy) – (∑x)(∑y)] / √([n∑x² – (∑x)²][n∑y² – (∑y)²])
R = [6(13937) – (202)(409)] / √([6(7280) – (202)²][6(28365) – (409)²])
R = 1004 / √([2876][2909])
R = 1004 / 2892.453
R = 0.3471
The correlation coefficient is 0.3471
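A few sanity checks apply to any worked answer and are sketched here with the Problem 5 data. The denominator is a positive square root, so R must share the sign of the numerator; each bracketed term is n² times a variance, so it cannot be negative; and R must lie between -1 and 1. The variable names are illustrative.

```python
import math

x = [21, 31, 25, 40, 47, 38]
y = [70, 55, 60, 78, 66, 80]
n = len(x)

numerator = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
x_term = n * sum(a * a for a in x) - sum(x) ** 2
y_term = n * sum(b * b for b in y) - sum(y) ** 2
r = numerator / math.sqrt(x_term * y_term)

# Check 1: R and the numerator have the same sign.
same_sign = (r > 0) == (numerator > 0)
# Check 2: neither bracketed term is negative.
terms_non_negative = x_term >= 0 and y_term >= 0
# Check 3: R lies in [-1, 1].
in_range = -1 <= r <= 1

print(numerator, x_term, y_term, round(r, 4))
print(same_sign, terms_non_negative, in_range)
```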
Problem 6: Calculate the correlation coefficient for the following data using Pearson’s correlation coefficient formula:
SUBJECT | Height X | Weight Y |
1 | 43 | 78 |
2 | 24 | 68 |
3 | 26 | 85 |
4 | 35 | 67 |
Solution:
Make a table from the given data and add three more columns: XY, X² and Y². Then add up the values in each column to get ∑xy, ∑x, ∑y, ∑x² and ∑y², with n = 4.
SUBJECT | Height X | Weight Y | XY | X² | Y² |
1 | 43 | 78 | 3354 | 1849 | 6084 |
2 | 24 | 68 | 1632 | 576 | 4624 |
3 | 26 | 85 | 2210 | 676 | 7225 |
4 | 35 | 67 | 2345 | 1225 | 4489 |
∑ | 128 | 298 | 9541 | 4326 | 22422 |
∑xy = 9541
∑x = 128
∑y = 298
∑x² = 4326
∑y² = 22422
n = 4
Put all the values into Pearson’s correlation coefficient formula:
R = [n(∑xy) – (∑x)(∑y)] / √([n∑x² – (∑x)²][n∑y² – (∑y)²])
R = [4(9541) – (128)(298)] / √([4(4326) – (128)²][4(22422) – (298)²])
R = 20 / √([920][884])
R = 20 / 901.820
R = 0.02218
The correlation coefficient is 0.02218
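Another equivalent way to view the coefficient is as the covariance of the two variables divided by the product of their standard deviations. The sketch below applies that form to the Problem 6 data using population statistics from Python’s standard library. It is offered as a cross-check under that assumption, not as the article’s method.

```python
import statistics

height = [43, 24, 26, 35]
weight = [78, 68, 85, 67]

mean_x = statistics.fmean(height)
mean_y = statistics.fmean(weight)
mean_xy = statistics.fmean(h * w for h, w in zip(height, weight))

# Population covariance: mean of the products minus the product of the means.
covariance = mean_xy - mean_x * mean_y
# Population standard deviations (pstdev divides by n, matching the covariance above).
r = covariance / (statistics.pstdev(height) * statistics.pstdev(weight))
print("R =", round(r, 5))
```

A value this close to zero indicates that the data show almost no linear relationship.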