How to calculate Pearson’s Correlation Coefficient?

Correlation is a statistical concept that describes the relationship between two variables. The correlation coefficient measures the strength and direction of the linear relationship between them, and it always lies between -1 and 1. If the coefficient is negative, the two variables are inversely related: one tends to decrease as the other increases. If it is zero, there is no linear relationship between the two variables. If it is positive, the two variables are directly related: they tend to increase together.

Pearson’s Correlation Coefficient

There are many types of correlation coefficients. The most commonly used is Pearson’s Correlation Coefficient. It is represented by the letter ‘r’, and for data of size N it is given by:

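r = cov(x, y) / (σ_x σ_y)

r = \frac{∑(x-\bar{x})(y-\bar{y})}{\sqrt{∑(x-\bar{x})^2}\sqrt{∑(y-\bar{y})^2}}

Where cov(x, y) represents the covariance of x and y,

σ_x and σ_y represent the standard deviations of x and y,

\bar{x} and \bar{y} represent the means of x and y.

The factors of 1/N in the covariance and in the two standard deviations cancel, which is why the second form contains only sums.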
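
The summation form of the formula translates fairly directly into code. The following is a minimal sketch using only the Python standard library; the function name `pearson_r` is hypothetical and chosen here purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from sums of deviations (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")

    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)

    # Numerator: sum of (x - x_bar) * (y - y_bar)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)

    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")

    return sxy / (sqrt(sxx) * sqrt(syy))


# Illustrative data: y is exactly twice x, so r should be 1 up to rounding.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

The explicit check for a zero denominator is a design choice of this sketch: when every x (or every y) is identical, the formula divides by zero and r is undefined.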

Sample Problems

Question 1: Find the Pearson correlation coefficient for the given data.

| x | y |
|---|---|
| 1 | 2 |
| 2 | 4 |
| 3 | 6 |

Solution:

| x | y | x-\bar{x} | y-\bar{y} | (x-\bar{x})^2 | (y-\bar{y})^2 | (x-\bar{x})(y-\bar{y}) |
|---|---|---|---|---|---|---|
| 1 | 2 | -1 | -2 | 1 | 4 | 2 |
| 2 | 4 | 0 | 0 | 0 | 0 | 0 |
| 3 | 6 | 1 | 2 | 1 | 4 | 2 |

\bar{x} =(1+2+3)/3

=6/3

\bar{x}=2

\bar{y} =(2+4+6)/3

=12/3

\bar{y}=4

 

 

∑(x-\bar{x})^2=2

∑(y-\bar{y})^2=8

∑(x-\bar{x})(y-\bar{y})=4

Pearson coefficient (r) = \frac{∑(x-\bar{x})(y-\bar{y})}{\sqrt{∑(x-\bar{x})^2}\sqrt{∑(y-\bar{y})^2}}

= 4/(√2√8)

= 4/4

= 1

Hence x and y have a perfect positive linear relationship: y increases in exact proportion to x.

Question 2: Find the Pearson correlation coefficient for the given data.

| x | y |
|---|---|
| 6 | 12 |
| 9 | 10 |
| 12 | 20 |

Solution:

| x | y | x-\bar{x} | y-\bar{y} | (x-\bar{x})^2 | (y-\bar{y})^2 | (x-\bar{x})(y-\bar{y}) |
|---|---|---|---|---|---|---|
| 6 | 12 | -3 | -2 | 9 | 4 | 6 |
| 9 | 10 | 0 | -4 | 0 | 16 | 0 |
| 12 | 20 | 3 | 6 | 9 | 36 | 18 |

\bar{x}= (6+9+12)/3

=27/3

\bar{x}=9

\bar{y}= (12+10+20)/3

=42/3

\bar{y}=14

 

 

 

∑(x-\bar{x})^2=18

∑(y-\bar{y})^2=56

∑(x-\bar{x})(y-\bar{y})=24

Pearson coefficient (r) = \frac{∑(x-\bar{x})(y-\bar{y})}{\sqrt{∑(x-\bar{x})^2}\sqrt{∑(y-\bar{y})^2}}

= 24/(√18√56)

= 24/(3√2 × 2√14)

= 4/(2√7)

= 2/√7

Pearson coefficient ≈ 0.76
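
As a cross-check, the same value can be obtained from the covariance form r = cov(x, y) / (σ_x σ_y). The sketch below assumes population statistics (dividing by N, not N - 1) and uses `fmean` and `pstdev` from Python's standard `statistics` module; the variable names are illustrative.

```python
from math import sqrt
from statistics import fmean, pstdev

xs = [6, 9, 12]
ys = [12, 10, 20]
n = len(xs)

x_bar, y_bar = fmean(xs), fmean(ys)

# Population covariance: the mean of the deviation products.
cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n

# Population standard deviations (divide by N, not N - 1).
r_cov_form = cov_xy / (pstdev(xs) * pstdev(ys))

# Closed form obtained in the worked solution: 2/sqrt(7).
r_closed_form = 2 / sqrt(7)

print(round(r_cov_form, 4), round(r_closed_form, 4))  # 0.7559 0.7559
```

Using the sample versions (dividing by N - 1) in both the covariance and the standard deviations would give the same r, because the factor cancels either way.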

Question 3: Find the Pearson correlation coefficient for the given data.

| x | y |
|---|---|
| 1 | 9 |
| 2 | 1 |
| 3 | 2 |
| 4 | 8 |

Solution:

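| x | y | x-\bar{x} | y-\bar{y} | (x-\bar{x})^2 | (y-\bar{y})^2 | (x-\bar{x})(y-\bar{y}) |
|---|---|---|---|---|---|---|
| 1 | 9 | -1.5 | 4 | 2.25 | 16 | -6 |
| 2 | 1 | -0.5 | -4 | 0.25 | 16 | 2 |
| 3 | 2 | 0.5 | -3 | 0.25 | 9 | -1.5 |
| 4 | 8 | 1.5 | 3 | 2.25 | 9 | 4.5 |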

\bar{x}= (1+2+3+4)/4

=10/4

=2.5

\bar{y}= (9+1+2+8)/4

=20/4

=5

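∑(x-\bar{x})^2=5

∑(y-\bar{y})^2=50

∑(x-\bar{x})(y-\bar{y})=-1

Pearson coefficient (r) = \frac{∑(x-\bar{x})(y-\bar{y})}{\sqrt{∑(x-\bar{x})^2}\sqrt{∑(y-\bar{y})^2}}

= -1/(√5√50)

= -1/(5√10)

Pearson coefficient ≈ -0.063

The coefficient is negative but very close to zero, which indicates that there is almost no linear relationship between the two variables.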
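
The deviation table and its column sums can also be built programmatically, which helps catch arithmetic slips. This is a small sketch for the data of this question; the tuple layout and variable names are assumptions made for illustration.

```python
from math import sqrt

xs = [1, 2, 3, 4]
ys = [9, 1, 2, 8]

x_bar = sum(xs) / len(xs)  # 2.5
y_bar = sum(ys) / len(ys)  # 5.0

# One row per data point: (x, y, dx, dy, dx^2, dy^2, dx*dy)
rows = []
for x, y in zip(xs, ys):
    dx, dy = x - x_bar, y - y_bar
    rows.append((x, y, dx, dy, dx * dx, dy * dy, dx * dy))

for row in rows:
    print(row)

# Column sums for the last three columns of the table.
sxx = sum(row[4] for row in rows)
syy = sum(row[5] for row in rows)
sxy = sum(row[6] for row in rows)

r_value = sxy / (sqrt(sxx) * sqrt(syy))
print(sxx, syy, sxy, round(r_value, 4))  # 5.0 50.0 -1.0 -0.0632
```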

Question 4: Find the Pearson correlation coefficient for the given data.

| x | y |
|---|---|
| 10 | 2 |
| 5 | 9 |
| 20 | 10 |
| 7 | 1 |

Solution:

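| x | y | x-\bar{x} | y-\bar{y} | (x-\bar{x})^2 | (y-\bar{y})^2 | (x-\bar{x})(y-\bar{y}) |
|---|---|---|---|---|---|---|
| 10 | 2 | -0.5 | -3.5 | 0.25 | 12.25 | 1.75 |
| 5 | 9 | -5.5 | 3.5 | 30.25 | 12.25 | -19.25 |
| 20 | 10 | 9.5 | 4.5 | 90.25 | 20.25 | 42.75 |
| 7 | 1 | -3.5 | -4.5 | 12.25 | 20.25 | 15.75 |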

\bar{x}= (10+5+20+7)/4

=42/4

=10.5

\bar{y}= (2+9+10+1)/4

=22/4

=5.5

 

 

∑(x-\bar{x})^2=133

∑(y-\bar{y})^2=65

∑(x-\bar{x})(y-\bar{y})=41

Pearson coefficient (r) = \frac{∑(x-\bar{x})(y-\bar{y})}{\sqrt{∑(x-\bar{x})^2}\sqrt{∑(y-\bar{y})^2}}

= 41/(√133√65)

Pearson coefficient ≈ 0.44
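
An algebraically equivalent form works directly from the raw sums ∑x, ∑y, ∑xy, ∑x^2 and ∑y^2, without computing the means first. It is not the method used in the solution above and is shown only as a cross-check; the function name `pearson_r_raw_sums` is hypothetical.

```python
from math import sqrt


def pearson_r_raw_sums(xs, ys):
    """Pearson's r via r = (N*Sxy - Sx*Sy) / sqrt((N*Sxx - Sx^2)(N*Syy - Sy^2))."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r_value = pearson_r_raw_sums([10, 5, 20, 7], [2, 9, 10, 1])
print(round(r_value, 2))  # 0.44
```

For this data the numerator is 4 × 272 - 42 × 22 = 164, which is 4 × 41, and the two bracketed terms are 4 × 133 and 4 × 65, so the common factor of 4 cancels and the result matches the deviation-based calculation.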

Question 5: Find the Pearson correlation coefficient for the given data.

| x | y |
|---|---|
| 1 | 11 |
| 2 | 22 |
| 3 | 34 |

Solution:

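| x | y | x-\bar{x} | y-\bar{y} | (x-\bar{x})^2 | (y-\bar{y})^2 | (x-\bar{x})(y-\bar{y}) |
|---|---|---|---|---|---|---|
| 1 | 11 | -1 | -11.33 | 1 | 128.37 | 11.33 |
| 2 | 22 | 0 | -0.33 | 0 | 0.10 | 0 |
| 3 | 34 | 1 | 11.67 | 1 | 136.19 | 11.67 |

\bar{x}=2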

\bar{y}= (11+22+34)/3

=67/3

=22.33

∑(x-\bar{x})^2=2

∑(y-\bar{y})^2=264.66

∑(x-\bar{x})(y-\bar{y})=23

Pearson coefficient (r) = \frac{∑(x-\bar{x})(y-\bar{y})}{\sqrt{∑(x-\bar{x})^2}\sqrt{∑(y-\bar{y})^2}}

= 23/(√2√264.66)

Pearson coefficient ≈ 0.9997

This indicates that x and y have a very strong positive linear relationship, although not a perfect one.
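
Because \bar{y} = 67/3 is rounded to 22.33 in the table, the intermediate squares there are approximate. The sketch below keeps the mean exact using `fractions.Fraction` from the Python standard library, to show how little the rounding affects the final result; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

xs = [1, 2, 3]
ys = [11, 22, 34]

x_bar = Fraction(sum(xs), len(xs))  # 2
y_bar = Fraction(sum(ys), len(ys))  # 67/3, kept exact instead of 22.33

sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

print(sxx, syy, sxy)  # 2 794/3 23

# Only the final square root needs floating point.
r_value = float(sxy) / sqrt(float(sxx) * float(syy))
print(round(r_value, 4))  # 0.9997
```

The exact sum 794/3 ≈ 264.67 differs from the rounded 264.66 only in the second decimal place, so the coefficient is unchanged to four decimal places.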



Last Updated : 02 May, 2022