Correlation Matrix in R Programming
• Last Updated : 23 Oct, 2020

Correlation measures the degree of linear relationship between two random variables. It is expressed as a coefficient in the interval [-1, 1]: a value of -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship. However, a value of 0 does not mean that the variables are independent of each other; they may still be related non-linearly. A correlation matrix collects the correlation coefficients for every pair of variables in a data set, computing them one pair at a time.
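As a quick illustration of these boundary cases, the following sketch uses small made-up vectors. The names `x`, `y_pos`, `y_neg` and `y_sq` are assumptions chosen for this example and are not part of the data used later in the article.

```r
# Hypothetical toy data: a symmetric range of integers
x <- -3:3

y_pos <- 2 * x + 1  # increasing linear function of x
y_neg <- -x         # decreasing linear function of x
y_sq  <- x^2        # fully determined by x, but not linearly

print(cor(x, y_pos))  # perfect positive linear relationship
print(cor(x, y_neg))  # perfect negative linear relationship
print(cor(x, y_sq))   # no linear relationship, although y_sq depends on x
```

The last case shows why a coefficient of 0 must not be read as independence: `y_sq` is computed directly from `x`, yet the linear correlation vanishes.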

### Properties of Correlation Matrices

1. All diagonal elements of the correlation matrix are 1, because the correlation of a variable with itself is always perfect: cii = 1.
2. The matrix is symmetric, cij = cji, because the correlation between variables i and j does not depend on their order.
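Both properties can be checked directly in R. The sketch below assumes the built-in `mtcars` data set and an arbitrary choice of three columns, purely for illustration.

```r
# Built-in data set, used here only as an example (an assumption)
cm <- cor(mtcars[, c("mpg", "wt", "hp")])

# Property 1: every diagonal element equals 1
diag_is_one <- isTRUE(all.equal(unname(diag(cm)), rep(1, ncol(cm))))
print(diag_is_one)

# Property 2: the matrix equals its transpose
print(isSymmetric(cm))
```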

### Computing Correlation Matrix in R

In R programming, a correlation matrix can be computed using the cor() function, which has the following syntax:

Syntax: cor(x, use = , method = )

Parameters:

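x: A numeric matrix or a data frame with numeric columns.
use: Specifies how missing data is handled. The default is "everything", which returns NA for any coefficient whose inputs contain missing values. Other values include:

• all.obs: assumes that the data has no missing values and throws an error otherwise.
• complete.obs: listwise deletion; rows containing any missing value are dropped before the computation.
• pairwise.complete.obs: pairwise deletion; each coefficient uses all rows that are complete for that pair of variables.

method: The type of correlation coefficient to compute. One of "pearson", "spearman", or "kendall". The default is "pearson".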
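To make the effect of `use` and `method` concrete, here is a minimal sketch on a made-up data frame with missing values. The data frame `df` and its columns `a`, `b`, `c` are hypothetical and exist only for this example.

```r
# Hypothetical data with one NA in column a and one NA in column c
df <- data.frame(
  a = c(1, 2, 3, 4, 5, NA),
  b = c(2, 4, 5, 4, 5, 7),
  c = c(9, 7, NA, 4, 3, 1)
)

# Default (use = "everything"): coefficients involving a or c are NA
cor_default <- cor(df)
print(cor_default)

# all.obs: missing values trigger an error, captured here as a message
all_obs_msg <- tryCatch(cor(df, use = "all.obs"),
                        error = function(e) conditionMessage(e))
print(all_obs_msg)

# Listwise deletion: only rows 1, 2, 4 and 5 are complete and get used
cor_listwise <- cor(df, use = "complete.obs")

# Pairwise deletion: each pair uses every row that is complete for that pair
cor_pairwise <- cor(df, use = "pairwise.complete.obs")

# Rank-based coefficient instead of the default Pearson
cor_spearman <- cor(df, use = "complete.obs", method = "spearman")

print(cor_listwise)
print(cor_pairwise)
print(cor_spearman)
```

With pairwise deletion the coefficient for `b` and `c` is based on five rows, while listwise deletion uses only the four fully complete rows, so the two matrices can differ.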

The correlation matrix can be computed in R after loading the data. The following code snippet shows the usage of the cor() function:

```r
# Load the data set from the specified URL into a data frame
data <- read.csv("https://people.sc.fsu.edu/~jburkardt/data/csv/ford_escort.csv",
                 header = TRUE, fileEncoding = "latin1")

# Print the first rows of the data
print("Original Data")
head(data)

# Compute the correlation matrix
cor_data <- cor(data)

print("Correlation matrix")
print(cor_data)
```

Output:

```
[1] "Original Data"
  Year Mileage..thousands. Price
1 1998                  27  9991
2 1997                  17  9925
3 1998                  28 10491
4 1998                   5 10990
5 1997                  38  9493
6 1997                  36  9991

[1] "Correlation matrix"
                          Year Mileage..thousands.      Price
Year                 1.0000000          -0.7480982  0.9343679
Mileage..thousands. -0.7480982           1.0000000 -0.8113807
Price                0.9343679          -0.8113807  1.0000000
```
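The example above works because every column in the CSV file is numeric. When a data frame also contains non-numeric columns, cor() stops with an error, so those columns have to be dropped first. A minimal sketch, assuming the built-in `iris` data set (whose `Species` column is a factor) as a stand-in:

```r
# iris ships with R; it is used here only for illustration (an assumption)
numeric_cols <- sapply(iris, is.numeric)  # TRUE for numeric columns only

# Keep the numeric columns and compute their correlation matrix
cor_iris <- cor(iris[, numeric_cols])
print(round(cor_iris, 2))
```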

### Computing Correlation Coefficients

The rcorr() function from the Hmisc package computes the correlation coefficients together with a table of p-values for all possible column pairs. It reports significance levels for Pearson and Spearman correlations and expects a matrix as input, so a data frame is converted with as.matrix() first.

Syntax:

rcorr(x, type = c("pearson", "spearman"))

Because rcorr() is not part of base R, the Hmisc package has to be installed once and then loaded into the session:

install.packages("Hmisc")

library("Hmisc")

The following code snippet shows the computation of correlation coefficients and their p-values in R:

```r
# Load the data set from the specified URL into a data frame
data <- read.csv("https://people.sc.fsu.edu/~jburkardt/data/csv/ford_escort.csv",
                 header = TRUE, fileEncoding = "latin1")

# Print the first rows of the data
print("Original Data")
head(data)

# Install the Hmisc package (needed only once) and load it
install.packages("Hmisc")
library("Hmisc")

# Compute correlation coefficients and p-values for all column pairs
cor_result <- rcorr(as.matrix(data))
print(cor_result)
```

Output:

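```
[1] "Original Data"
  Year Mileage..thousands. Price
1 1998                  27  9991
2 1997                  17  9925
3 1998                  28 10491
4 1998                   5 10990
5 1997                  38  9493
6 1997                  36  9991

                     Year Mileage..thousands. Price
Year                 1.00               -0.75  0.93
Mileage..thousands. -0.75                1.00 -0.81
Price                0.93               -0.81  1.00

n= 23

P
                    Year Mileage..thousands. Price
Year                     0                   0
Mileage..thousands. 0                        0
Price               0    0
```

The first table holds the correlation coefficients, n is the number of observations, and P holds the p-values. The diagonal of P is blank because a variable is not tested against itself, and the remaining p-values appear as 0 because they are rounded for display.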
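The object returned by rcorr() is a list whose components `r`, `n` and `P` can be accessed individually. The sketch below assumes the built-in `mtcars` data set instead of the CSV file, and it also uses base R's cor.test() for a single pair of columns so that part of it runs without Hmisc. The rcorr() part is skipped when the package is not installed.

```r
# Built-in data, used only for illustration (an assumption)
m <- as.matrix(mtcars[, c("mpg", "wt", "hp")])

# Base R: coefficient and p-value for one pair of columns
single <- cor.test(m[, "mpg"], m[, "wt"], method = "pearson")
print(single$estimate)  # correlation coefficient
print(single$p.value)   # p-value of the test that the true correlation is 0

# Hmisc: the same quantities for all pairs at once
if (requireNamespace("Hmisc", quietly = TRUE)) {
  res <- Hmisc::rcorr(m, type = "pearson")
  print(res$r)  # matrix of correlation coefficients
  print(res$n)  # number of observations used for each pair
  print(res$P)  # matrix of p-values (NA on the diagonal)
}
```

Accessing `res$P` directly avoids the rounding applied when the whole object is printed.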

### Visualize a Correlation Matrix

To visualize a correlation matrix refer to these articles:
