Correlation measures the strength and direction of the linear relationship between two random variables. It is expressed as a coefficient in the interval [-1, 1]: a value of -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship. A value of 0 does not, however, imply that the variables are independent; they may still be related non-linearly. A correlation matrix collects these coefficients for a set of random variables, computing one for every pair of variables in the data.
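As a small illustration of why a coefficient near 0 does not mean independence, the sketch below uses made-up data (the vectors `x` and `y` are hypothetical, not taken from the article's dataset) in which `y` is completely determined by `x`, yet the linear correlation vanishes.

```r
# Hypothetical data: y is fully determined by x, but not linearly.
x <- seq(-5, 5, by = 1)
y <- x^2

# Pearson correlation only captures linear association.
r_nonlinear <- cor(x, y)
print(r_nonlinear)   # effectively zero, although y depends entirely on x

# For comparison, a perfectly linear, decreasing relationship.
r_linear <- cor(x, -2 * x + 3)
print(r_linear)
```

The symmetric choice of `x` around 0 is what makes the positive and negative contributions cancel in the first case.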

**Properties of Correlation Matrices**

- All diagonal elements of the correlation matrix must be 1, because the correlation of a variable with itself is always perfect: $c_{ii} = 1$.
- The matrix must be symmetric: $c_{ij} = c_{ji}$.
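Both properties can be checked numerically. The following is a minimal sketch that assumes the `mtcars` dataset bundled with R (not the dataset used later in this article); the column selection is arbitrary.

```r
# mtcars is a dataset bundled with R; the three columns are an arbitrary choice.
m <- cor(mtcars[, c("mpg", "hp", "wt")])

# Property 1: every diagonal element equals 1.
diag_is_one <- isTRUE(all.equal(unname(diag(m)), rep(1, ncol(m))))

# Property 2: the matrix equals its own transpose.
is_symmetric <- isSymmetric(m)

print(diag_is_one)
print(is_symmetric)
```

Both checks are expected to print `TRUE` for any numeric data without missing values or constant columns.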

**Computing Correlation Matrix in R**

In R, a correlation matrix can be computed using the cor() function, which has the following syntax:

Syntax: `cor(x, use = , method = )`

Parameters:

- x: A numeric matrix or a data frame.
- use: Specifies how missing data is handled.
  - all.obs: assumes that the data has no missing values and throws an error if any are found.
  - complete.obs: listwise deletion (rows containing any missing value are dropped).
  - pairwise.complete.obs: pairwise deletion (each coefficient uses all rows that are complete for that pair of variables).
- method: Specifies the type of correlation to compute: "pearson", "spearman", or "kendall". The default is "pearson".
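The effect of the `use` and `method` arguments is easiest to see on data with a missing value. The sketch below uses a small made-up data frame (`df` and its columns `a`, `b`, `c` are hypothetical) and is meant only to illustrate the options described above.

```r
# Hypothetical data frame with one missing value in column b.
df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(2, 4, NA, 8, 10),
  c = c(5, 3, 4, 1, 2)
)

# With no `use` argument, a missing value makes the affected coefficients NA.
cor_default <- cor(df)
print(cor_default)

# Listwise deletion: row 3 is dropped for every pair of columns.
cor_complete <- cor(df, use = "complete.obs")
print(cor_complete)

# Pairwise deletion: row 3 is dropped only for pairs that involve column b.
cor_pairwise <- cor(df, use = "pairwise.complete.obs")
print(cor_pairwise)

# Rank-based correlation instead of the default Pearson coefficient.
cor_spearman <- cor(df, use = "pairwise.complete.obs", method = "spearman")
print(cor_spearman)
```

Note that the coefficient for `a` and `c` differs between the listwise and pairwise results, because the pairwise computation keeps row 3 for that pair.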

The correlation matrix can be computed in R after loading the data. The following code snippet shows the usage of the **cor()** function:

```r
# loading the dataset from a CSV file into a data frame
# NOTE: the dataset URL is missing from the source text; "<dataset-url>" is a
# placeholder to be replaced with the actual location of the CSV file
data <- read.csv("<dataset-url>", header = TRUE, fileEncoding = "latin1")

# printing the head of the data
print("Original Data")
head(data)

# computing correlation matrix
cor_data <- cor(data)

print("Correlation matrix")
print(cor_data)
```


**Output:**

```
[1] "Original Data"
  Year Mileage..thousands. Price
1 1998                  27  9991
2 1997                  17  9925
3 1998                  28 10491
4 1998                   5 10990
5 1997                  38  9493
6 1997                  36  9991
[1] "Correlation matrix"
                          Year Mileage..thousands.      Price
Year                 1.0000000          -0.7480982  0.9343679
Mileage..thousands. -0.7480982           1.0000000 -0.8113807
Price                0.9343679          -0.8113807  1.0000000
```
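For readers without access to the original CSV file, the call can be tried on the six rows printed by `head(data)` above. This is only a sketch: the data frame name `cars_head` is hypothetical, and because it contains just those six rows rather than the full dataset, the resulting coefficients will differ from the matrix shown in the output.

```r
# Only the six rows shown by head(data); the full dataset is not reproduced
# here, so these coefficients will not match the output above.
cars_head <- data.frame(
  Year                = c(1998, 1997, 1998, 1998, 1997, 1997),
  Mileage..thousands. = c(27, 17, 28, 5, 38, 36),
  Price               = c(9991, 9925, 10491, 10990, 9493, 9991)
)

# Same call as in the article, applied to the partial data.
cor_head <- cor(cars_head)
print(round(cor_head, 2))
```

Even on this subset, the signs follow the same pattern as the full result: newer cars tend to cost more, and higher mileage goes with a lower price.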

**Computing Correlation Coefficients**

The **rcorr()** function from the Hmisc package generates the correlation coefficients and a table of p-values for all possible column pairs of a data frame. It computes the significance levels for **Pearson and Spearman correlations**.

Syntax: `rcorr(x, type = c("pearson", "spearman"))`

To use this function, we need to install the "**Hmisc**" package and load it into the environment. This can be done in the following way:

```r
install.packages("Hmisc")
library("Hmisc")
```

The following code snippet shows the computation of correlation coefficients in R:

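```r
# loading the dataset from a CSV file into a data frame
# NOTE: the dataset URL is missing from the source text; "<dataset-url>" is a
# placeholder to be replaced with the actual location of the CSV file
data <- read.csv("<dataset-url>", header = TRUE, fileEncoding = "latin1")

# printing the head of the data
print("Original Data")
head(data)

# installing and loading the Hmisc library
install.packages("Hmisc")
library("Hmisc")

# computing correlation coefficients and p-values of the data loaded
# (rcorr() expects a matrix, hence as.matrix)
p_values <- rcorr(as.matrix(data))
print(p_values)
```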


**Output:**

```
[1] "Original Data"
  Year Mileage..thousands. Price
1 1998                  27  9991
2 1997                  17  9925
3 1998                  28 10491
4 1998                   5 10990
5 1997                  38  9493
6 1997                  36  9991
                     Year Mileage..thousands. Price
Year                 1.00               -0.75  0.93
Mileage..thousands. -0.75                1.00 -0.81
Price                0.93               -0.81  1.00

n= 23

P
                    Year Mileage..thousands. Price
Year                      0                   0
Mileage..thousands. 0                         0
Price               0     0
```
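The printed result combines three pieces: the coefficients, the number of observations, and the p-values. As a sketch of how these might be accessed individually, the example below assumes that Hmisc is already installed and uses the `mtcars` dataset bundled with R rather than the article's data; the variable name `res` is hypothetical.

```r
# Assumes the Hmisc package is already installed; the block is skipped otherwise.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  # mtcars ships with R; the column choice is arbitrary.
  res <- Hmisc::rcorr(as.matrix(mtcars[, c("mpg", "hp", "wt")]), type = "pearson")

  print(round(res$r, 2))  # matrix of correlation coefficients
  print(res$n)            # number of observations used for each pair
  print(res$P)            # matrix of p-values for each pair
}
```

Passing `type = "spearman"` instead would request rank-based coefficients, as described in the syntax above.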

### Visualize a Correlation Matrix

To visualize a correlation matrix, refer to these articles:

- Visualize correlation matrix using correlogram in R Programming
- Visualize Correlation Matrix using symnum function in R Programming