Covariance and Correlation in R Programming
  • Last Updated : 01 Jun, 2020

Covariance and correlation are statistical measures of the relationship between two random variables. Both describe the linear dependence between a pair of random variables, that is, in bivariate data.
In this article, we discuss the cov(), cor() and cov2cor() functions in R, which compute covariance and correlation as defined in statistics and probability theory.

Covariance

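In R programming, covariance can be computed using the cov() function. Covariance is a statistical measure of the direction of the linear relationship between two data vectors: a positive value means the vectors tend to increase together, and a negative value means one tends to decrease as the other increases. cov() returns the sample covariance, which divides by N - 1 rather than N. Mathematically,

  \operatorname{Cov}(x, y)=\frac{\sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{N-1}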

where,

x represents the x data vector
y represents the y data vector
\bar{x} represents the mean of the x data vector
\bar{y} represents the mean of the y data vector
N represents the total number of observations
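
As a quick check of the formula, the following sketch computes the covariance by hand and compares it with cov(). Note that cov() divides by N - 1 (the sample covariance); dividing by N instead gives the population covariance. The variable names sxy, manual_cov and population_cov are illustrative only.

```r
# Data vectors (the same values used in the example in this article)
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

n <- length(x)

# Sum of the products of deviations from the means
sxy <- sum((x - mean(x)) * (y - mean(y)))

# Sample covariance: divide by n - 1, as cov() does
manual_cov <- sxy / (n - 1)

# Population covariance: divide by n instead
population_cov <- sxy / n

print(manual_cov)
print(cov(x, y))
print(population_cov)
```

The first two printed values agree, while the third is smaller because of the larger denominator.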



Syntax:

cov(x, y, method)

where,

x and y represent the data vectors
method specifies which covariance to compute: "pearson" (the default), "kendall" or "spearman".

Example:




# Data vectors
x <- c(1, 3, 5, 10)
  
y <- c(2, 4, 6, 20)
  
# Print covariance using different methods
print(cov(x, y))
  
print(cov(x, y, method = "pearson"))
  
print(cov(x, y, method = "kendall"))
  
print(cov(x, y, method = "spearman"))

Output:

[1] 30.66667
[1] 30.66667
[1] 12
[1] 1.666667
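
The "kendall" and "spearman" results are not on the scale of the original data. As a rough sketch of where the values printed above come from: with "spearman", cov() works on the ranks of the data, and with "kendall" the result for these vectors matches the sum of sign products over all ordered pairs of observations. The names spearman_cov and kendall_cov are illustrative, and the Kendall expression is an assumption checked against this example rather than a statement of R's internals.

```r
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Spearman: ordinary (Pearson) covariance of the ranks
spearman_cov <- cov(rank(x), rank(y))

# Kendall: sum of sign(x_i - x_j) * sign(y_i - y_j) over all ordered pairs (i, j)
kendall_cov <- sum(sign(outer(x, x, "-")) * sign(outer(y, y, "-")))

print(spearman_cov)
print(cov(x, y, method = "spearman"))

print(kendall_cov)
print(cov(x, y, method = "kendall"))
```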

Correlation

The cor() function in R programming computes the correlation coefficient. Correlation is the covariance scaled by the standard deviations of both vectors, which gives a unitless value between -1 and 1 that measures how strongly the vectors are linearly related. Mathematically,

 \operatorname{Corr}(x, y)=\frac{\sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sqrt{\sum\left(x_{i}-\bar{x}\right)^{2} \sum\left(y_{i}-\bar{y}\right)^{2}}}

where,

x represents the x data vector
y represents the y data vector
\bar{x} represents the mean of the x data vector
\bar{y} represents the mean of the y data vector
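
The formula can be checked directly against cor(). The sketch below computes the Pearson coefficient from the deviations and also as the covariance divided by the product of the standard deviations; the names dx, dy, manual_cor and scaled_cov are illustrative only.

```r
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Deviations from the means
dx <- x - mean(x)
dy <- y - mean(y)

# Correlation straight from the formula
manual_cor <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Equivalent form: covariance scaled by both standard deviations
scaled_cov <- cov(x, y) / (sd(x) * sd(y))

print(manual_cor)
print(scaled_cov)
print(cor(x, y))
```

All three printed values should agree, because the N - 1 factors in the covariance and the standard deviations cancel.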



Syntax:

cor(x, y, method)

where,

x and y represent the data vectors
method specifies which correlation coefficient to compute: "pearson" (the default), "kendall" or "spearman".

Example:




# Data vectors
x <- c(1, 3, 5, 10)
  
y <- c(2, 4, 6, 20)
  
# Print correlation using different methods
print(cor(x, y))
  
print(cor(x, y, method = "pearson"))
  
print(cor(x, y, method = "kendall"))
  
print(cor(x, y, method = "spearman"))

Output:

[1] 0.9724702
[1] 0.9724702
[1] 1
[1] 1
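
The "kendall" and "spearman" coefficients equal 1 here because both are rank-based: they only look at the ordering of the values, and x and y are ordered identically. The sketch below illustrates this with a hypothetical vector z, a non-linear but strictly increasing transformation of x.

```r
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Both vectors have the same ordering, so their ranks are identical
print(rank(x))
print(rank(y))

# Spearman correlation is the Pearson correlation of the ranks
print(cor(rank(x), rank(y)))

# A strictly increasing, non-linear transformation of x
z <- x^3

# Rank-based measures stay at 1
print(cor(x, z, method = "spearman"))
print(cor(x, z, method = "kendall"))

# Pearson correlation drops below 1 because the relationship is not linear
print(cor(x, z))
```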

Conversion of Covariance to Correlation

The cov2cor() function in R programming converts a covariance matrix into the corresponding correlation matrix.

Syntax:

cov2cor(X)

where,

X represents the square covariance matrix
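
As a sketch of what the conversion does: each entry (i, j) of the covariance matrix is divided by the product of the standard deviations of variables i and j, which are the square roots of the diagonal entries. The vector z and the names s and manual are made up for illustration.

```r
# Three small data vectors (z is made up for this illustration)
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)
z <- c(9, 4, 7, 1)

# Covariance matrix of the three variables
X <- cov(cbind(x, y, z))

# Standard deviations are the square roots of the diagonal
s <- sqrt(diag(X))

# Divide entry (i, j) by s[i] * s[j]
manual <- X / outer(s, s)

print(manual)
print(cov2cor(X))
```

Both matrices should match, with ones on the diagonal.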

Example:




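# Data vectors: two random draws each, so the values differ between runs
# (two distinct observations always give a correlation of +1 or -1)
x <- rnorm(2)
y <- rnorm(2)
  
# Bind the vectors as columns of a 2 x 2 matrix
mat <- cbind(x, y)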
  
# Defining X as the covariance matrix
X <- cov(mat)
  
# Print covariance matrix
print(X)
  
# Print correlation matrix computed directly from the data
print(cor(mat))
  
# Using function cov2cor()
# To convert covariance matrix to correlation matrix
print(cov2cor(X))

Output:

           x          y
x  0.0742700 -0.1268199
y -0.1268199  0.2165516

   x  y
x  1 -1
y -1  1

   x  y
x  1 -1
y -1  1
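
With only two observations per vector, the correlation is forced to be exactly +1 or -1, as in the output above. A minimal variation, assuming ten observations and an arbitrary fixed seed so that the run is repeatable, gives a less degenerate comparison:

```r
# Fix the seed so the random draws are repeatable (42 is an arbitrary choice)
set.seed(42)

# Ten observations per variable instead of two
x <- rnorm(10)
y <- rnorm(10)
mat <- cbind(x, y)

# Covariance matrix of the two columns
X <- cov(mat)

# The two routes to the correlation matrix agree
print(cor(mat))
print(cov2cor(X))
print(all.equal(cor(mat), cov2cor(X)))
```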
