
How to Calculate the Coefficient of Determination?

Statistics is the branch of mathematics concerned with collecting, organizing, analyzing, interpreting, and presenting data. In statistics, the coefficient of determination measures how much of the variation in one variable can be explained by the variation in another variable. For example, whether a person gets a job is directly related to how well they performed in the interview. Specifically, R-squared gives the percentage of the variation in y that is explained by the x-variables. It ranges from 0 to 1 (that is, 0% to 100% of the variation in y can be explained by the x-variables). It is closely related to the correlation coefficient (R): the correlation coefficient tells how strong the linear relationship between two variables is, and R-squared is the square of the correlation coefficient (hence the name r squared).

Coefficient of determination



The coefficient of determination can be read as a percentage. It gives an idea of how closely the data points fall around the line formed by the regression equation. The higher the coefficient, the closer the points lie to the line when the data points and the line are plotted. In other words, the coefficient of determination is the proportion of variance in the dependent variable that is predictable from the independent variable. If the coefficient is 0.70, then 70% of the variation in the dependent variable is explained by the regression line. A higher coefficient indicates a better goodness of fit for the observations. The values 0 and 1 indicate a regression line that explains none or all of the variation in the data, respectively.

If the coefficient of determination (CoD) is negative, it means that your model is a very poor fit for your data. It can become negative when the model is fitted without an intercept.
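A negative value can be reproduced with a small sketch. The data below is hypothetical, the helper name `r_squared_from_predictions` is made up for illustration, and the line is deliberately forced through the origin (no intercept) while the value is computed as 1 minus RSS/TSS:

```python
def r_squared_from_predictions(y, y_hat):
    """Return 1 - RSS/TSS for observed values y and predictions y_hat."""
    mean_y = sum(y) / len(y)
    rss = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))  # residual sum of squares
    tss = sum((yi - mean_y) ** 2 for yi in y)              # total sum of squares
    return 1 - rss / tss

# Hypothetical data with a downward trend (not from the article).
x = [1, 2, 3, 4]
y = [10, 9, 8, 7]

# Least-squares line forced through the origin: slope = sum(xy) / sum(x^2).
slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
y_hat = [slope * xi for xi in x]

r2_no_intercept = r_squared_from_predictions(y, y_hat)
print(r2_no_intercept)  # below zero: the line fits worse than the mean of y
```

Because the fitted line cannot shift up or down, it predicts worse than simply using the mean of y, so RSS exceeds TSS and the result drops below zero.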



The coefficient of determination is typically written as R²_p. Here, p denotes the number of columns of data, which is useful when comparing the R² of different data sets.

Properties of Coefficient of Determination

Formula of coefficient of determination

The formula of coefficient of determination can be written in two different ways:

Formula 1: 

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

Here, R represents the correlation coefficient (its square, R², is the coefficient of determination), n is the total number of observations, ∑x is the sum of the first variable's values, ∑y is the sum of the second variable's values, ∑xy is the sum of the products of the first and second values, ∑x² is the sum of the squares of the first values, and ∑y² is the sum of the squares of the second values.
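As a rough illustration, Formula 1 can be written as a small function that takes the column sums directly. The function name `correlation_from_sums` and the three-point data set are assumptions made for this sketch, not part of the article:

```python
import math

def correlation_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Correlation coefficient R computed from n and the five column sums."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical, perfectly linear data: x = 1, 2, 3 and y = 2, 4, 6.
# Sums: n = 3, sum x = 6, sum y = 12, sum xy = 28, sum x^2 = 14, sum y^2 = 56.
r = correlation_from_sums(3, 6, 12, 28, 14, 56)
print(r)       # 1.0
print(r ** 2)  # 1.0
```

Note that the whole numerator is divided by the whole square root; grouping the terms this way is what the brackets in the formula express.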

Formula 2:

R² = 1 – (RSS/TSS)

Here, R² represents the coefficient of determination, RSS is the residual sum of squares, and TSS is the total sum of squares.
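Formula 2 needs predicted values, so the sketch below first fits an ordinary least-squares line with an intercept and then computes RSS and TSS. The helper names `fit_line` and `r_squared` are illustrative, and the data is the X = 5, 9, 14, 16 and Y = 6, 10, 16, 20 set used in one of the sample questions:

```python
def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y):
    """Coefficient of determination as 1 - RSS/TSS for the fitted line."""
    slope, intercept = fit_line(x, y)
    mean_y = sum(y) / len(y)
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - rss / tss

x = [5, 9, 14, 16]
y = [6, 10, 16, 20]
print(round(r_squared(x, y), 3))
```

For a straight-line fit with an intercept and one predictor, this value agrees with the square of the correlation coefficient from Formula 1, which is why the two formulas can be used interchangeably in that setting.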

Steps to calculate the coefficient of determination

Step 1: First, find the correlation coefficient (it may already be given in the question, e.g., R = 0.657).

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

Step 2: Now square the correlation coefficient.

0.657² = 0.432

Step 3: Now convert the squared value (R²) into a percentage.

0.432 × 100 = 43.2%
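When the correlation coefficient is already known, Steps 2 and 3 reduce to one line of arithmetic. A minimal sketch, with `cod_percent` as a made-up helper name:

```python
def cod_percent(r):
    """Square a correlation coefficient and express the result as a percentage."""
    return (r ** 2) * 100

print(f"{cod_percent(0.657):.1f}%")  # 43.2%
```

Squaring removes the sign, so a correlation of -0.657 would give the same percentage.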

Sample Question

Question 1: Find the coefficient of determination from the following data.

| SUBJECT | AGE X | GLUCOSE LEVEL Y |
|---|---|---|
| 1 | 42 | 98 |
| 2 | 23 | 68 |
| 3 | 22 | 73 |
| 4 | 47 | 79 |
| 5 | 50 | 88 |
| 6 | 60 | 82 |

Solution:

To get the CoD, first find the correlation coefficient of the given data. Make a table from the given data and add three more columns for XY, X², and Y². Add up the values in each column to get ∑x, ∑y, ∑xy, ∑x², and ∑y², with n = 6.

| SUBJECT | AGE X | GLUCOSE LEVEL Y | XY | X² | Y² |
|---|---|---|---|---|---|
| 1 | 42 | 98 | 4116 | 1764 | 9604 |
| 2 | 23 | 68 | 1564 | 529 | 4624 |
| 3 | 22 | 73 | 1606 | 484 | 5329 |
| 4 | 47 | 79 | 3713 | 2209 | 6241 |
| 5 | 50 | 88 | 4400 | 2500 | 7744 |
| 6 | 60 | 82 | 4920 | 3600 | 6724 |
| ∑ | 244 | 488 | 20319 | 11086 | 40266 |

∑xy = 20319

∑x = 244

∑y = 488

∑x² = 11086

∑y² = 40266

n = 6

Put all the values into the correlation coefficient formula:

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

R = [6(20319) – (244)(488)] / √{[6(11086) – (244)²][6(40266) – (488)²]}

R = 2842 / √{[6980][3452]}

R = 2842 / 4908.662

R = 0.5790

Now square the correlation coefficient

R² = (0.5790)² = 0.335

Convert the R-squared into a percentage

0.335 × 100 = 33.5%

So, 33.5% of the variation in y can be explained by the x-variable.
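Hand-built tables are easy to get wrong, so it can help to rebuild the XY, X², and Y² columns from the raw age and glucose values and let the totals be checked mechanically. This is only a sketch; the variable names are illustrative:

```python
import math

ages = [42, 23, 22, 47, 50, 60]
glucose = [98, 68, 73, 79, 88, 82]

# One row per subject: X, Y, XY, X^2, Y^2.
rows = [(x, y, x * y, x ** 2, y ** 2) for x, y in zip(ages, glucose)]
totals = [sum(column) for column in zip(*rows)]

for row in rows:
    print(*row)
print("totals:", *totals)

n = len(rows)
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = totals
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(f"R = {r:.4f}, R^2 = {r ** 2:.3f}")
```

Comparing each printed row with the table, and the last line with the final answer, catches slips in individual products or sums.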

Question 2: Find the coefficient of determination from the following given data?

X = 21, 31, 25, 40, 47, 38 and Y = 70, 55, 60, 78, 66, 80

Solution:-

Given variables are,

X = 21, 31, 25, 40, 47, 38

and

Y = 70, 55, 60, 78, 66, 80

To get the CoD, first find the correlation coefficient of the given data. To do so, construct a table as follows to get the values required in the formula.

| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 21 | 70 | 1470 | 441 | 4900 |
| 31 | 55 | 1705 | 961 | 3025 |
| 25 | 60 | 1500 | 625 | 3600 |
| 40 | 78 | 3120 | 1600 | 6084 |
| 47 | 66 | 3102 | 2209 | 4356 |
| 38 | 80 | 3040 | 1444 | 6400 |
| ∑ 202 | ∑ 409 | ∑ 13937 | ∑ 7280 | ∑ 28365 |

∑xy = 13937

∑x = 202

∑y = 409

∑x² = 7280

∑y² = 28365

n = 6

Put all the values into the correlation coefficient formula:

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

R = [6(13937) – (202)(409)] / √{[6(7280) – (202)²][6(28365) – (409)²]}

R = 1004 / √{[2876][2909]}

R = 1004 / 2892.453

R = 0.3471

Now square the correlation coefficient

R² = (0.3471)² = 0.120

Convert the R-squared into a percentage

0.120 × 100 = 12.0%

So, 12.0% of the variation in y can be explained by the x-variable.

Question 3: Given X = 5, 9, 14, 16 and Y = 6, 10, 16, 20. Find the coefficient of determination.

Solution:

Given variables are,

X = 5, 9, 14, 16

and

Y = 6, 10, 16, 20

To get the CoD, first find the correlation coefficient of the given data. To do so, construct a table as follows to get the values required in the formula.

| X | Y | XY | X² | Y² |
|---|---|---|---|---|
| 5 | 6 | 30 | 25 | 36 |
| 9 | 10 | 90 | 81 | 100 |
| 14 | 16 | 224 | 196 | 256 |
| 16 | 20 | 320 | 256 | 400 |
| ∑ 44 | ∑ 52 | ∑ 664 | ∑ 558 | ∑ 792 |

∑xy = 664

∑x = 44

∑y = 52

∑x² = 558

∑y² = 792

n = 4

Put all the values into the correlation coefficient formula:

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

R = [4(664) – (44)(52)] / √{[4(558) – (44)²][4(792) – (52)²]}

R = 368 / √{[296][464]}

R = 368 / 370.599

R = 0.993

Now square the correlation coefficient

R² = (0.993)² = 0.986

Convert the R-squared into a percentage

0.986 × 100 = 98.6%

So, 98.6% of the variation in y can be explained by the x-variable.

Question 4: The correlation coefficient is .6894. Find out the coefficient of determination.

Solution:

The correlation coefficient is 0.6894. Square it to get the coefficient of determination.

R = 0.6894

R² = (0.6894)² = 0.475

Convert the R-squared into a percentage

0.475 × 100 = 47.5%

The coefficient of determination is 47.5 percent.

Question 5: The correlation coefficient is .3659. Find out the coefficient of determination.

Solution:

The correlation coefficient is 0.3659. Square it to get the coefficient of determination.

R = 0.3659

R² = (0.3659)² = 0.134

Convert the R-squared into a percentage

0.134 × 100 = 13.4%

The coefficient of determination is 13.4 percent.

