How to Calculate the Coefficient of Determination?

  • Last Updated : 11 May, 2022

In mathematics, statistics is the study of the collection, analysis, interpretation, presentation, and organization of data. In statistics, the coefficient of determination (R-squared, R²) measures how much of the variation in one variable can be explained by the variation in another variable. For example, whether a person gets a job is directly related to how well they performed in the interview. Specifically, R-squared gives the proportion of the variation in y that is explained by the x-variables. It ranges from 0 to 1 (so 0% to 100% of the variation in y can be explained by the x-variables). It is closely related to the correlation coefficient (R). The correlation coefficient tells how strong the linear relationship between two variables is, and in simple linear regression R-squared is the square of the correlation coefficient (hence the name r squared).

Coefficient of determination

The coefficient of determination can be read as a percentage. It indicates how closely the data points fall to the line given by the regression equation. The higher the coefficient, the closer the points lie to the line when the data points and the line are plotted. In other words, the coefficient of determination is the proportion of variance in the dependent variable that is predictable from the independent variable. If the coefficient is 0.70, then 70% of the variance in the dependent variable is explained by the regression line (it does not mean that 70% of the points lie on the line). A higher coefficient indicates a better goodness of fit for the observations. Values of 0 and 1 mean that the regression line explains none or all of the variation in the data, respectively.

If the coefficient of determination (CoD) is negative, the model fits the data worse than a horizontal line drawn at the mean of y. This can happen when the regression is fitted without an intercept.
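A minimal sketch of how a negative value can arise, using made-up data (the numbers below are hypothetical, not from this article). A line forced through the origin is fitted to points that sit near y = 10, and R² is computed as 1 – RSS/TSS:

```python
# Hypothetical data: y stays close to 10 while x runs from 1 to 5.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 11.0, 10.0, 12.0, 11.0]

# Least-squares slope for a line with no intercept: b = sum(xy) / sum(x^2)
slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

y_mean = sum(y) / len(y)
rss = sum((b - slope * a) ** 2 for a, b in zip(x, y))  # residual sum of squares
tss = sum((b - y_mean) ** 2 for b in y)                # total sum of squares
r_squared = 1 - rss / tss

print(round(r_squared, 2))  # -31.5
```

Because the through-origin line misses the data badly, RSS is far larger than TSS and the result drops below zero.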

The coefficient of determination is sometimes written as R²_p. Here, p denotes the number of columns of data, which is useful when comparing the R² of different data sets.

  • If R² = 0, the dependent variable cannot be predicted from the independent variable.
  • If R² = 1, the dependent variable can be predicted from the independent variable without error.
  • If R² is between 0 and 1, it indicates the extent to which the dependent variable is predictable. For example, R² = 0.20 means that 20% of the variance in y is predictable from x.

Properties of Coefficient of Determination

  • It gives the proportion of the variation in one variable that can be predicted from the other variable.
  • It is a measure of how reliable predictions made from the given data are likely to be.
  • It equals the ratio Explained variation / Total variation.
  • It indicates the strength of the linear relationship between the variables.
  • As the value of r² gets closer to 1, the values of y lie closer to the regression line; as it gets closer to 0, the values lie farther from the regression line.
  • It helps in describing the strength of association between different variables.

Formula of coefficient of determination

The formula of coefficient of determination can be written in two different ways:

Formula 1: 

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

Here, R represents the correlation coefficient, whose square R² is the coefficient of determination; n is the total number of observations, ∑x is the sum of the first variable's values, ∑y is the sum of the second variable's values, ∑xy is the sum of the products of the first and second values, ∑x² is the sum of the squares of the first values, and ∑y² is the sum of the squares of the second values.
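Formula 1 can be sketched in Python as follows. The function name `pearson_r` and the sample lists are illustrative choices, not part of the original article; squaring the returned value gives the coefficient of determination.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient computed from the raw sums used in Formula 1."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical, perfectly linear data: y = 2x
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
print(r, r ** 2)  # 1.0 1.0
```

Note that the whole numerator is divided by the whole square root; the denominator is undefined (zero) if either variable is constant.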

Formula 2:

R² = 1 – (RSS/TSS)

Here, R² represents the coefficient of determination, RSS is the residual sum of squares, and TSS is the total sum of squares.
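Formula 2 can be sketched the same way. Here the predictions are assumed to come from some already fitted model; the helper name `r_squared` and the numbers are hypothetical.

```python
def r_squared(ys, predictions):
    """Coefficient of determination as 1 - RSS/TSS."""
    mean_y = sum(ys) / len(ys)
    rss = sum((y - p) ** 2 for y, p in zip(ys, predictions))  # residual sum of squares
    tss = sum((y - mean_y) ** 2 for y in ys)                  # total sum of squares
    return 1 - rss / tss

# Hypothetical observed values and model predictions
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(observed, predicted), 2))  # 0.98
```

Unlike Formula 1, this form works for any set of predictions, which is why it can return a value below zero for a poor model.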

Steps to calculate the coefficient of determination

Step 1: First find the correlation coefficient (or it may be given in the question, e.g., r = 0.657).

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

Step 2: Now square the correlation coefficient.

0.657² = 0.432

Step 3: Now convert R² into a percentage.

0.432 = 43.2%
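Once the correlation coefficient is known, the last two steps are a single squaring and a scaling by 100. A short sketch using the value 0.657 from Step 2:

```python
r = 0.657                    # correlation coefficient from Step 1
r_squared = r ** 2           # Step 2: square it
percent = r_squared * 100    # Step 3: express as a percentage

print(f"{r_squared:.3f} -> {percent:.1f}%")  # 0.432 -> 43.2%
```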

Sample Question

Question 1: Find the coefficient of determination from the following data.

Subject   Age X   Glucose Level Y
1         42      98
2         23      68
3         22      73
4         47      79
5         50      88
6         60      82

Solution:

To get the CoD, first find the correlation coefficient of the given data. Make a table from the given data and add three more columns for XY, X², and Y². Add up the values in each column to get ∑x, ∑y, ∑xy, ∑x², and ∑y², with n = 6.

Subject   Age X   Glucose Level Y   XY      X²      Y²
1         42      98                4116    1764    9604
2         23      68                1564    529     4624
3         22      73                1606    484     5329
4         47      79                3713    2209    6241
5         50      88                4400    2500    7744
6         60      82                4920    3600    6724
∑         244     488               20319   11086   40266

∑xy = 20319

∑x = 244

∑y = 488

∑x² = 11086

∑y² = 40266

n = 6.

Put all the values in the correlation coefficient formula:

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

R = [6(20319) – (244)(488)] / √{[6(11086) – (244)²][6(40266) – (488)²]}

R = 2842 / √{[6980][3452]}

R = 2842 / 4908.662

R = 0.5790

Now square the correlation coefficient.

R² = (0.5790)² = 0.335

Convert R-squared into a percentage.

0.335 × 100 = 33.5%

So, 33.5% of the variation in y can be explained by the x-variable.
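As a quick check, the column totals and the result for Question 1 can be recomputed directly from the six (age, glucose level) pairs. This is only a verification sketch; the variable names are arbitrary.

```python
import math

# (age X, glucose level Y) pairs from Question 1
data = [(42, 98), (23, 68), (22, 73), (47, 79), (50, 88), (60, 82)]

n = len(data)
sum_x = sum(x for x, _ in data)
sum_y = sum(y for _, y in data)
sum_xy = sum(x * y for x, y in data)
sum_x2 = sum(x * x for x, _ in data)
sum_y2 = sum(y * y for _, y in data)

r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)  # 244 488 20319 11086 40266
print(round(r, 4), round(r ** 2, 3))          # 0.579 0.335
```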

Question 2: Find the coefficient of determination from the following given data?

X = 21, 31, 25, 40, 47, 38 and Y = 70, 55, 60, 78, 66, 80

Solution:-

Given variables are,

X = 21, 31, 25, 40, 47, 38

and

Y = 70, 55, 60, 78, 66, 80

To get the CoD, first find the correlation coefficient of the given data. To find the correlation coefficient, first construct a table as follows to get the values required in the formula.

X     Y     XY      X²     Y²
21    70    1470    441    4900
31    55    1705    961    3025
25    60    1500    625    3600
40    78    3120    1600   6084
47    66    3102    2209   4356
38    80    3040    1444   6400
∑ 202 ∑ 409 ∑ 13937 ∑ 7280 ∑ 28365

∑xy = 13937

∑x = 202

∑y = 409

∑x² = 7280

∑y² = 28365

n = 6

Put all the values in the correlation coefficient formula:

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

R = [6(13937) – (202)(409)] / √{[6(7280) – (202)²][6(28365) – (409)²]}

R = 1004 / √{[2876][2909]}

R = 1004 / 2892.453

R = 0.3471

Now square the correlation coefficient.

R² = (0.3471)² = 0.120

Convert R-squared into a percentage.

0.120 × 100 = 12.0%

So, 12.0% of the variation in y can be explained by the x-variable.
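The result for Question 2 can be cross-checked with the equivalent deviation form of the correlation coefficient, r = Σ(x – x̄)(y – ȳ) / √[Σ(x – x̄)² Σ(y – ȳ)²]. This form is swapped in here purely as an independent check; it is algebraically the same quantity as the raw-sums formula used above.

```python
import math

x = [21, 31, 25, 40, 47, 38]
y = [70, 55, 60, 78, 66, 80]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Sums of products and squares of deviations from the means
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)

r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 4), round(r ** 2, 3))  # 0.3471 0.12
```

The sign of r is positive here because s_xy is positive: larger X values tend to go with larger Y values in this data set.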

Question 3: Given X = 5 ,9 ,14, 16 and Y = 6, 10, 16, 20. Find the coefficient of determination.

Solution:

Given variables are,

X = 5 ,9 ,14, 16

and

Y = 6, 10, 16, 20

To get the CoD, first find the correlation coefficient of the given data. To find the correlation coefficient, first construct a table as follows to get the values required in the formula.

X     Y     XY     X²     Y²
5     6     30     25     36
9     10    90     81     100
14    16    224    196    256
16    20    320    256    400
∑ 44  ∑ 52  ∑ 664  ∑ 558  ∑ 792

∑xy = 664

∑x = 44

∑y = 52

∑x² = 558

∑y² = 792

n = 4

Put all the values in the correlation coefficient formula:

R = [n(∑xy) – (∑x)(∑y)] / √{[n∑x² – (∑x)²][n∑y² – (∑y)²]}

R = [4(664) – (44)(52)] / √{[4(558) – (44)²][4(792) – (52)²]}

R = 368 / √{[296][464]}

R = 368 / 370.599

R = 0.993

Now square the correlation coefficient.

R² = (0.993)² = 0.986

Convert R-squared into a percentage.

0.986 × 100 = 98.6%

So, 98.6% of the variation in y can be explained by the x-variable.
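The same answer for Question 3 can be reached through Formula 2 (R² = 1 – RSS/TSS) by fitting the least-squares line first. The sketch below assumes an ordinary least-squares fit with an intercept, in which case the two formulas agree.

```python
x = [5, 9, 14, 16]
y = [6, 10, 16, 20]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least-squares slope and intercept
slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
    (a - mean_x) ** 2 for a in x
)
intercept = mean_y - slope * mean_x

rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))  # residual sum of squares
tss = sum((b - mean_y) ** 2 for b in y)                              # total sum of squares
r_squared = 1 - rss / tss

print(round(r_squared, 3))  # 0.986
```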

Question 4: The correlation coefficient is .6894. Find out the coefficient of determination.

Solution:

The correlation coefficient = 0.6894 (square the correlation coefficient)

R = 0.6894

R² = (0.6894)² = 0.475

Convert R-squared into a percentage.

0.475 × 100 = 47.5

The coefficient of determination is 47.5 percent.

Question 5: The correlation coefficient is .3659. Find out the coefficient of determination.

Solution:

The correlation coefficient = 0.3659 (square the correlation coefficient)

R = 0.3659

R² = (0.3659)² = 0.134

Convert R-squared into a percentage.

0.134 × 100 = 13.4

The coefficient of determination is 13.4 percent.

