Open In App

How to Calculate Partial Correlation in Excel?

Improve
Improve
Like Article
Like
Save
Share
Report

Partial correlation helps find the correlation between the two variables by removing the effect of the third variable. There can be situations when the relations between variables can be many. This could reduce the accuracy of correlation or could also give wrong results. Partial correlation removes the effects of other variables. Excel helps us find a partial correlation automatically by the formula. In this article, we will learn how to find partial correlations in excel. 

Correlation 

Before understanding partial correlation, we need to have a better understanding of correlation. Correlation is a way by which we can find how variables are related to each other. The value of correlation lies between -1 and 1, inclusive. Correlation helps find whether two variables are directly or indirectly proportional. A positive correlation signifies that the value of one will increase by increasing the other values. A negative correlation signifies that the value of one will increase, and the other will decrease. For example, there is a positive correlation between smoking and lung cancer. There is a negative correlation between sleep and productivity. Possible values of Correlation: 

Correlation(X, Y) > 0

A positive value signifies both variables are positively correlated. If one increases, then the other also increases, or if one decreases, then the other also decreases.

Correlation(X, Y) == 0

A correlation value of zero(0) signifies that there is no relation between the two variables. 

Correlation(X, Y) < 0

A negative value signifies both variables are negatively correlated. If one increases, then the other decreases, or if one decreases, then the other increases.

Need for Partial Correlation 

Correlation works well until a third variable comes into consideration, which is correlated to both of the variables. For example, you are given two variables, Blood Pressure, and Income. It’s generally seen that people with more income also have high blood pressure because of more amount of work they have to do. If that’s the condition, then Blood Pressure and Income should be highly correlated, but this is not correct if we consider other parameters also, like Age. A person who is young, having less age seems to have normal blood pressure, despite having a good income, or an aged person seems to have high blood pressure, even though his income is still not so good. This proves to us that Blood Pressure and Income are not highly correlated if considered Age is considered a factor. If, in such a situation, we use correlation, then it may lead to wrong results. 

To solve this issue, the concept of partial correlation was introduced. Partial correlation calculates the correlation of two variables by controlling a third variable that affects both variables. 

Formula to Calculate Partial Correlation

Consider two variables, X and Y, for which correlation has to be calculated. Consider a third variable, Z, which is related to both variables X and Y. The formula for calculating partial correlation is: 

PartialCorr(X, Y).Z = \frac{Corr(X, Y) - Corr(X, Z).Corr(Y, Z)} {\sqrt{1 - (Corr(X, Z))^2}.\sqrt{1 - ((Corr(Y, Z))^2}}

Where,

PartialCorr(X, Y).Z = Partial Correlation of X and Y variables, keeping in control of Z variable,

Corr(X, Y) = Correlation of X and Y,

Corr(Y, Z) = Correlation of Y and Z,

Corr(X, Z) = Correlation of X and Z. 

Procedure to Calculate Partial Correlation Using Excel 

We can automate our task of calculating Partial Correlation in excel. For example, “Arushi” is the “Maths” teacher of the 10th class. She had the data of students, consisting of internal assessment marks(Z), the number of hours a student play(X), and the final marks in Term2(Y). She wants to find the number of hours a student play is correlated to the final marks in Term2 by controlling the internal assessment marks. It means that she wants to find a Partial Correlation between X and Y, controlling Z

Dataset

 

Following are the steps

Step 1: Make four(4) new columns in range E2:E5, name Correlation(X, Y), Correlation(Y, Z), Correlation(X, Z) and PartialCorrelation(X,Y).Z. Use =correl(array1, array2) function to calculation the correlation between two variables. 

Making-column-for-different-correlation-values

 

Step 2: In cell F2, use =correl(B2:B6, C2:C6), function to calculate the correlation between Hours of Study(X) and Final marks(Y). Press Enter

Applying-Correl-function-for-Correlation(X,Y)

 

Step 3: The value of correlation between Hours of Study(X) and Final Marks(Y) is -0.77116

Final-marks-obtained

 

Step 4: In cell F3, use =correl(B2:B6, A2:A6) function to calculate the correlation between Hours of Study(X) and Internal Assessment Marks(Z). Press Enter.

Applying-Correl-function-for-Correlation(X,Z)

 

Step 5: The value of correlation between Hours of Study(X) and Internal Assessment Marks(Z) is 0.518751

Internal-assessment-marks-obtained

 

Step 6: In cell F4, use =correl(c2:C6, A2:A6), function to calculate the correlation between Final marks(Y) and Internal Assessment Marks(Z). Press Enter.

Applying-Correl-function-for-Correlation(Y,Z)

 

Step 7: The value of correlation between Final marks(Y) and Internal Assessment Marks(Z) is -0.01257

Internal-assessment-marks-obtained

 

Step 8: In cell F5, write the formula of partial correlation as explained above, i.e., =(F2-F3*F4)/(SQRT(1-F3*F3)*SQRT(1 – F4*F4)). Press Enter

Applying-function-for-Correlation(X,Y,Z)

 

Step 9: The value of partial correlation between Hours of Study(X) and Final Marks(Y), controlling Internal Assessment Marks(Z) is -0.89446.

Partial-correlation-value-obtained

 



Last Updated : 22 Aug, 2022
Like Article
Save Article
Previous
Next
Share your thoughts in the comments
Similar Reads