
How to Calculate Partial Correlation in R?

Last Updated : 19 Dec, 2023

In this article, we will discuss how to calculate Partial Correlation in the R Programming Language.

Partial correlation measures the degree of association between two random variables after removing the effect of one or more other variables that influence both of them. Controlling for these variables gives a more precise picture of the direct relationship between the two variables of interest than an ordinary correlation does.

To calculate partial correlation in the R Language, we use the pcor() function of the ppcor package. The ppcor package calculates partial and semi-partial correlations along with their p-values. The pcor() function calculates the pairwise partial correlation for each pair of variables, given all the other variables in the data. It also returns the test statistic and the p-value for each pair.

To use the pcor() function, we first need to install the ppcor package:

install.packages("ppcor")

After installation, we can load the ppcor library by using the library() function. Then use the following syntax to calculate the Partial Correlation in the R Language.

Syntax:

pcor( df )

Parameter:

  • df: the data frame of numeric variables for which the pairwise partial correlations are calculated.
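As a minimal sketch of how the returned object might be used, the example below pulls individual components out of the pcor() result, switches the correlation method, and tests a single pair with pcor.test(). The data frame demo_df and its column names are made up for illustration, and the code is skipped if the ppcor package is not installed.

```r
# Hypothetical data, used only for illustration
demo_df <- data.frame(u = c(2, 4, 5, 7, 9, 10, 12, 15),
                      v = c(1, 3, 2, 6, 7, 9, 8, 12),
                      w = c(5, 3, 6, 2, 8, 7, 9, 4))

if (requireNamespace("ppcor", quietly = TRUE)) {
  library(ppcor)

  res <- pcor(demo_df)       # Pearson is the default method
  est <- res$estimate        # matrix of partial correlations
  pv  <- res$p.value         # matrix of p-values
  print(est["u", "v"])       # u vs v, controlling for w

  # Rank-based alternative
  res_s <- pcor(demo_df, method = "spearman")
  print(res_s$estimate)

  # Single pair: u vs v given w
  pair <- pcor.test(demo_df$u, demo_df$v, demo_df$w)
  print(pair)
}
```

The matrix form is convenient when every pair is of interest, while pcor.test() is handy when only one pair and one set of control variables matter.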

Basic example of partial correlation with two columns of the data frame

R




# create sample data frame
sample_data <- data.frame( x= c(1,2,3,4,5,6,7,7,7,8),
                           y= c(4,5,6,7,8,9,9,9,10,10))
 
# load library ppcor
library(ppcor)
 
# calculate Partial Correlation
pcor( sample_data )


Output:

$estimate
          x         y
x 1.0000000 0.9854592
y 0.9854592 1.0000000

$p.value
             x            y
x 0.000000e+00 1.921901e-07
y 1.921901e-07 0.000000e+00

$statistic
         x        y
x  0.00000 16.40436
y 16.40436  0.00000

$n
[1] 10

$gp
[1] 0

$method
[1] "pearson"

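Here, the partial correlation between x and y is 0.9854592, which indicates a very strong positive linear association: y tends to increase as x increases. Because the data frame has only two columns, there is no variable to control for ($gp is 0), so this value is the same as the ordinary Pearson correlation between x and y.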
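Since $gp is 0 in this output, no variable is being controlled for. As a quick check in base R, the sketch below recomputes the estimate with cor() and derives the statistic and p-value from a t-distribution with n - 2 - gp degrees of freedom. It illustrates the arithmetic under that assumption and is not the package's source code; the variable names are chosen for illustration.

```r
# Same data as the two-column example
x <- c(1, 2, 3, 4, 5, 6, 7, 7, 7, 8)
y <- c(4, 5, 6, 7, 8, 9, 9, 9, 10, 10)

n  <- length(x)   # number of observations
gp <- 0           # number of control variables

r      <- cor(x, y)                               # ordinary Pearson correlation
t_stat <- r * sqrt((n - 2 - gp) / (1 - r^2))      # t statistic
p_val  <- 2 * pt(-abs(t_stat), df = n - 2 - gp)   # two-sided p-value

print(c(estimate = r, statistic = t_stat, p.value = p_val))
```

The printed estimate and statistic should agree with the x-y entries of $estimate and $statistic in the output above.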

Basic example of partial correlation with three columns of the data frame

R




# create sample data frame
sample_data <- data.frame( x= c(1,2,3,4,5,6,7,7,7,8),
                           y= c(4,5,6,7,8,9,9,9,10,10),
                           z= c(1,3,5,7,9,11,13,15,17,19))
 
# load library ppcor
library(ppcor)
 
# calculate Partial Correlation
pcor( sample_data )


Output:

$estimate
          x          y          z
x 1.0000000 0.76314445 0.58810321
y 0.7631444 1.00000000 0.05552034
z 0.5881032 0.05552034 1.00000000

$p.value
           x          y          z
x 0.00000000 0.01673975 0.09578687
y 0.01673975 0.00000000 0.88718502
z 0.09578687 0.88718502 0.00000000

$statistic
         x         y         z
x 0.000000 3.1244245 1.9238403
y 3.124425 0.0000000 0.1471199
z 1.923840 0.1471199 0.0000000

$n
[1] 10

$gp
[1] 1

$method
[1] "pearson"

Here, the x and y vectors are the same as in the previous example, but the partial correlation between them has changed because z is now controlled for ($gp is 1). The value dropped from 0.9854592 to 0.76314445 because part of the association between x and y is shared with z. The partial correlation between x and z, given y, is 0.58810321, while y and z are almost uncorrelated once x is controlled for (0.05552034, with a p-value of about 0.887).
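One way to see what controlling for z means is to remove the linear effect of z from both x and y and then correlate what is left. The sketch below does this with base R's lm() and residuals(), and also builds the full matrix from the inverse of the covariance matrix. Both are standard formulations of the Pearson partial correlation, shown here as an illustration rather than as the package's exact implementation; the variable names are assumptions.

```r
# Same data as the three-column example
sample_data <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 7, 7, 8),
                          y = c(4, 5, 6, 7, 8, 9, 9, 9, 10, 10),
                          z = c(1, 3, 5, 7, 9, 11, 13, 15, 17, 19))

# 1) Residual approach: x and y with the linear effect of z removed
res_x <- residuals(lm(x ~ z, data = sample_data))
res_y <- residuals(lm(y ~ z, data = sample_data))
pc_xy_given_z <- cor(res_x, res_y)

# 2) Matrix approach: negated, rescaled inverse of the covariance matrix
precision <- solve(cov(sample_data))
pc_matrix <- -cov2cor(precision)
diag(pc_matrix) <- 1

print(pc_xy_given_z)
print(round(pc_matrix, 4))
```

The residual approach handles one pair at a time, whereas the matrix approach gives every pair at once, each controlled for all remaining columns.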


