Covariance and Correlation in R Programming

Covariance and correlation are statistical measures of the relationship between two random variables. Both describe the linear dependence between a pair of random variables, that is, in bivariate data.
In this article, we discuss the cov(), cor() and cov2cor() functions in R, which compute covariance and correlation.

Covariance

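In R programming, covariance is computed with the cov() function. Covariance measures the direction of the linear relationship between two data vectors: a positive value means they tend to increase together, and a negative value means one tends to decrease as the other increases. cov() returns the sample covariance, which divides by N - 1 rather than N. Mathematically,

  \operatorname{Cov}(x, y)=\frac{\sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{N-1}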

where,

x represents the x data vector
y represents the y data vector
\bar{x} represents the mean of the x data vector
\bar{y} represents the mean of the y data vector
N represents the total number of observations
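
As a quick check of the formula, the sketch below computes the covariance step by step and compares the result with cov(). The vectors are the ones used in the example further down; the names dx, dy and manual_cov are illustrative only and not part of any R API.

```r
# Data vectors (same values as in the article's example)
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Deviation of each observation from its vector's mean
dx <- x - mean(x)
dy <- y - mean(y)

# Sample covariance: sum of the products of deviations, divided by N - 1
manual_cov <- sum(dx * dy) / (length(x) - 1)

print(manual_cov)
print(all.equal(manual_cov, cov(x, y)))
```

Dividing by length(x) instead of length(x) - 1 would give the population covariance, which is smaller and does not match what cov() returns.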



Syntax:

cov(x, y, method)

where,

x and y represent the data vectors
method defines the method used to compute the covariance: "pearson" (the default), "kendall" or "spearman".

Example:

# Data vectors
x <- c(1, 3, 5, 10)
  
y <- c(2, 4, 6, 20)
  
# Print covariance using different methods
print(cov(x, y))
  
print(cov(x, y, method = "pearson"))
  
print(cov(x, y, method = "kendall"))
  
print(cov(x, y, method = "spearman"))

Output:

[1] 30.66667
[1] 30.66667
[1] 12
[1] 1.666667
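
The "kendall" and "spearman" values are not on the scale of the data, which can be surprising. The sketch below shows where they appear to come from: "spearman" is the ordinary covariance of the ranks, and "kendall", as far as the output above suggests, is the unnormalised sum of sign agreements over all ordered pairs of observations. The names spearman_manual and kendall_manual are illustrative only.

```r
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Spearman: Pearson covariance applied to the ranks of the data
spearman_manual <- cov(rank(x), rank(y))
print(spearman_manual)

# Kendall: sum over all ordered pairs (i, j) of
# sign(x_i - x_j) * sign(y_i - y_j), with no further scaling
kendall_manual <- sum(sign(outer(x, x, "-")) * sign(outer(y, y, "-")))
print(kendall_manual)
```

Because these values depend on the number of observations rather than on the units of x and y, the rank-based methods are usually more informative with cor() than with cov().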

Correlation

The cor() function in R programming computes the correlation coefficient. Correlation standardizes the covariance by the standard deviations of both vectors, giving a unitless value between -1 and 1 that measures both the strength and the direction of the linear relationship. Mathematically,

 \operatorname{Corr}(x, y)=\frac{\sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sqrt{\sum\left(x_{i}-\bar{x}\right)^{2} \sum\left(y_{i}-\bar{y}\right)^{2}}}

where,

x represents the x data vector
y represents the y data vector
\bar{x} represents the mean of the x data vector
\bar{y} represents the mean of the y data vector
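
The formula can be checked by hand. The sketch below evaluates it directly and also shows the equivalent form, covariance divided by the product of the two standard deviations. The names manual_cor and scaled_cov are illustrative only.

```r
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Deviation of each observation from its vector's mean
dx <- x - mean(x)
dy <- y - mean(y)

# Pearson correlation straight from the formula
manual_cor <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Equivalent form: covariance scaled by both standard deviations
scaled_cov <- cov(x, y) / (sd(x) * sd(y))

print(manual_cor)
print(scaled_cov)
print(all.equal(manual_cor, cor(x, y)))
```

The N - 1 factors in cov() and sd() cancel, which is why the correlation formula needs no explicit denominator for the sample size.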



Syntax:

cor(x, y, method)

where,

x and y represent the data vectors
method defines the method used to compute the correlation: "pearson" (the default), "kendall" or "spearman".

Example:

# Data vectors
x <- c(1, 3, 5, 10)
  
y <- c(2, 4, 6, 20)
  
# Print correlation using different methods
print(cor(x, y))
  
print(cor(x, y, method = "pearson"))
  
print(cor(x, y, method = "kendall"))
  
print(cor(x, y, method = "spearman"))

Output:

[1] 0.9724702
[1] 0.9724702
[1] 1
[1] 1
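
The "kendall" and "spearman" results are exactly 1 because both methods work on the ordering of the values rather than on the values themselves, and here y always increases when x increases. A small sketch with hypothetical data (a cubic relationship, chosen only for illustration) makes the difference from "pearson" easier to see:

```r
# Hypothetical data: y grows with x, but not along a straight line
x <- 1:10
y <- x^3

# Pearson measures linear association, so it falls below 1 here
print(cor(x, y, method = "pearson"))

# Kendall and Spearman depend only on the ordering of the values
print(cor(x, y, method = "kendall"))
print(cor(x, y, method = "spearman"))
```

A rank-based method is therefore a reasonable choice when the relationship is expected to be monotonic but not necessarily linear.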

Conversion of Covariance to Correlation

The cov2cor() function in R programming converts a covariance matrix into the corresponding correlation matrix.

Syntax:

cov2cor(X)

where,
X represents the square covariance matrix
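
Conceptually, the conversion divides each entry X[i, j] by the product of the standard deviations of variables i and j, which are the square roots of the diagonal entries. The sketch below reproduces this by hand on a hypothetical 2 x 2 covariance matrix; the matrix values and the names s and manual are assumptions made for illustration.

```r
# Hypothetical covariance matrix: var(x) = 4, var(y) = 9, cov(x, y) = 3
X <- matrix(c(4, 3,
              3, 9), nrow = 2, byrow = TRUE,
            dimnames = list(c("x", "y"), c("x", "y")))

# Standard deviations are the square roots of the diagonal
s <- sqrt(diag(X))

# Divide entry (i, j) by s[i] * s[j]
manual <- X / outer(s, s)

print(manual)
print(cov2cor(X))
```

The diagonal of the result is always 1, since each variance is divided by itself.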

Example:

# Data vectors
x <- rnorm(2)
y <- rnorm(2)
  
# Bind the vectors as columns of a matrix (one column per variable)
mat <- cbind(x, y)
  
# Defining X as the covariance matrix
X <- cov(mat)
  
# Print covariance matrix
print(X)
  
# Print correlation matrix computed directly from the data
print(cor(mat))
  
# Using function cov2cor()
# To convert covariance matrix to correlation matrix
print(cov2cor(X))

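Output (the values differ on every run because rnorm() draws random numbers; with only two observations per vector, the correlation is always exactly 1 or -1):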

           x          y
x  0.0742700 -0.1268199
y -0.1268199  0.2165516

   x  y
x  1 -1
y -1  1

   x  y
x  1 -1
y -1  1
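
With only two observations per vector, any two points lie on a straight line, so the example above can only ever produce a correlation of 1 or -1. A variation that is a little more informative, sketched below, uses ten observations and a fixed seed so that the run is repeatable. The seed value 42 and the sample size are arbitrary choices for illustration.

```r
# Fix the random seed so the numbers are the same on every run
set.seed(42)

# Ten observations per variable instead of two
x <- rnorm(10)
y <- rnorm(10)
mat <- cbind(x, y)

# Covariance matrix and its conversion to a correlation matrix
X <- cov(mat)
print(cov2cor(X))

# Both routes should give the same correlation matrix
print(all.equal(cov2cor(X), cor(mat)))
```

cov2cor() is mainly useful when only the covariance matrix is available, for example from a fitted model; when the raw data are at hand, cor() gives the same matrix directly.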



