Correlation Matrix in R Programming
Last Updated: 24 Nov, 2023
Correlation refers to the relationship between two variables, specifically the degree of linear association between any two random variables. It is expressed as a value within the interval [-1, 1]. The value -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship, either positive or negative. However, a value of 0 does not mean that the variables are completely independent of each other. A correlation matrix in R holds the degree of linear relationship for a set of random variables, computed one pair at a time for every pair of variables in the data.
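As a small illustration of these boundary values, the sketch below uses made-up vectors (x, y_pos, y_neg and y_sq are hypothetical names, not taken from any data set in this article) together with base R's cor(). The last case shows a variable that is fully determined by x yet has no linear correlation with it.

```r
# Made-up data, symmetric around zero
x <- -5:5

y_pos <- 2 * x + 1   # increases linearly with x
y_neg <- -3 * x      # decreases linearly with x
y_sq  <- x^2         # depends on x, but not linearly

r_pos <- cor(x, y_pos)   # perfect positive linear relationship
r_neg <- cor(x, y_neg)   # perfect negative linear relationship
r_sq  <- cor(x, y_sq)    # no linear relationship despite full dependence

print(c(positive = r_pos, negative = r_neg, quadratic = r_sq))
```

The quadratic case is why a coefficient near 0 should be read as "no linear relationship" rather than "no relationship".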
Properties of Correlation Matrix in R
- All the diagonal elements of the correlation matrix must be 1, because the correlation of a variable with itself is always perfect: cii = 1.
- The matrix must be symmetric: cij = cji.
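Both properties can be checked numerically. The following is a minimal sketch that assumes the built-in mtcars data set as input; the variable names cm, diag_ok and sym_ok are chosen only for this illustration.

```r
# Correlation matrix of the numeric built-in data set mtcars
cm <- cor(mtcars)

# Property 1: every diagonal element equals 1
diag_ok <- isTRUE(all.equal(unname(diag(cm)), rep(1, ncol(cm))))

# Property 2: the matrix equals its transpose
sym_ok <- isSymmetric(cm)

print(c(diagonal_is_one = diag_ok, symmetric = sym_ok))
```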
Computing Correlation Matrix in R
In the R programming language, a correlation matrix can be computed using the cor() function, which has the following syntax:
Syntax: cor(x, use = , method = )
Parameters:
x: A numeric matrix or a data frame with numeric columns.
use: Specifies how missing data is handled.
- "all.obs": assumes that the data has no missing values and throws an error if any are found.
- "complete.obs": listwise deletion, so any row containing a missing value is dropped before computing.
- "pairwise.complete.obs": pairwise deletion, so each coefficient uses all rows that are complete for that pair of variables.
method: Specifies the type of correlation coefficient. One of "pearson", "spearman", or "kendall" can be used. The default method is "pearson".
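To see how the use and method arguments behave, here is a hedged sketch on the built-in airquality data set, which contains missing values in its Ozone and Solar.R columns. The data set and the variable names are assumptions made for this example and are not part of the article's own data.

```r
# airquality is a built-in data set with NAs in Ozone and Solar.R
aq <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]

# Listwise deletion: rows with any NA are dropped for all pairs
c_listwise <- cor(aq, use = "complete.obs")

# Pairwise deletion: each pair uses every row complete for that pair
c_pairwise <- cor(aq, use = "pairwise.complete.obs")

# Rank-based alternative to the default Pearson coefficient
c_spearman <- cor(aq, use = "complete.obs", method = "spearman")

# "all.obs" stops with an error because the data contains NAs
all_obs_failed <- inherits(try(cor(aq, use = "all.obs"), silent = TRUE),
                           "try-error")

print(round(c_listwise, 2))
print(all_obs_failed)
```

With pairwise deletion, the Wind and Temp coefficient is computed from all rows, because neither column has missing values, while listwise deletion also discards rows where only Ozone or Solar.R is missing.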
Correlation in R Programming Language
The correlation matrix is computed after loading the data. The following code snippet shows the usage of the cor() function:
R
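# NOTE: the file path was lost from the original listing; "path/to/data.csv"
# is a placeholder, so substitute the path or URL of your own CSV file.
data = read.csv("path/to/data.csv",
                header = TRUE, fileEncoding = "latin1")
print("Original Data")
head(data)
# Pearson correlation between every pair of numeric columns
cor_data = cor(data)
print("Correlation matrix")
print(cor_data)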
|
Output:
[1] "Original Data"
  Year Mileage..thousands. Price
1 1998                  27  9991
2 1997                  17  9925
3 1998                  28 10491
4 1998                   5 10990
5 1997                  38  9493
6 1997                  36  9991
[1] "Correlation matrix"
                          Year Mileage..thousands.      Price
Year                 1.0000000          -0.7480982  0.9343679
Mileage..thousands. -0.7480982           1.0000000 -0.8113807
Price                0.9343679          -0.8113807  1.0000000
Computing Correlation Coefficients of Correlation Matrix in R
The rcorr() function from the Hmisc package generates the correlation coefficients and a table of p-values for all possible column pairs of a data frame. It computes the significance levels for Pearson and Spearman correlations.
Syntax: rcorr(x, type = c("pearson", "spearman"))
In order to run this function in R, we need to download and load the "Hmisc" package into the environment. This can be done in the following way:
install.packages("Hmisc")
library("Hmisc")
The following code snippet indicates the computation of correlation coefficients in R:
R
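# NOTE: the file path was lost from the original listing; "path/to/data.csv"
# is a placeholder, so substitute the path or URL of your own CSV file.
data = read.csv("path/to/data.csv",
                header = TRUE, fileEncoding = "latin1")
print("Original Data")
head(data)
# install.packages("Hmisc")  # run once if the package is not installed
library("Hmisc")
# rcorr() expects a matrix; its result holds the coefficients, n and p-values
p_values <- rcorr(as.matrix(data))
print(p_values)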
|
Output:
[1] "Original Data"
  Year Mileage..thousands. Price
1 1998                  27  9991
2 1997                  17  9925
3 1998                  28 10491
4 1998                   5 10990
5 1997                  38  9493
6 1997                  36  9991
                     Year Mileage..thousands. Price
Year                 1.00               -0.75  0.93
Mileage..thousands. -0.75                1.00 -0.81
Price                0.93               -0.81  1.00

n= 23

P
                    Year Mileage..thousands. Price
Year                     0                   0
Mileage..thousands. 0                        0
Price               0    0
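rcorr() reports coefficients and p-values for every pair of columns at once. For a single pair, a similar check can be sketched with base R's cor.test(), which needs no extra package. The example below uses the built-in mtcars data set rather than the CSV file from this article, and the choice of the mpg and wt columns is only for illustration.

```r
# Test the Pearson correlation between two columns of mtcars
ct <- cor.test(mtcars$mpg, mtcars$wt, method = "pearson")

r_value <- unname(ct$estimate)  # correlation coefficient
p_value <- ct$p.value           # p-value for H0: the true correlation is 0

print(c(r = r_value, p = p_value))
```

A small p-value suggests that the observed linear relationship is unlikely under the assumption of zero correlation.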
Visualize a Correlation Matrix in R
In R, we shall use the “corrplot” package to implement a correlogram. Hence, to install the package from the R Console we should execute the following command:
install.packages("corrplot")
Once we have installed the package properly, we shall load the package in our R script using the library() function as follows:
library("corrplot")
We will use the corrplot() function and specify the shape used to draw each coefficient through its method argument.
R
library(corrplot)
head(mtcars)
# Correlation matrix of the built-in mtcars data set
M <- cor(mtcars)
head(round(M, 2))
# Draw the correlogram with each glyph shape in turn
corrplot(M, method = "circle")
corrplot(M, method = "pie")
corrplot(M, method = "color")
corrplot(M, method = "number")
|
Output:
Visualize Correlogram as a pie chart
R
corrplot (M, method= "pie" )
|
Output:
Visualize Correlogram as colored rectangles
R
corrplot (M, method= "color" )
|
Output:
Visualize Correlogram as numbers
R
corrplot (M, method= "number" )
|
Output:
Visualize Correlogram as ellipses
R
corrplot(M, method = "ellipse")
|
Output:
Visualize Correlogram as shaded squares
R
corrplot (M, method= "shade" )
|
Output:
We can choose the visualization method that best suits our needs or preferences. The corrplot package provides various customization options for each visualization method.
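As one possible customization, the sketch below combines several corrplot() arguments on the mtcars correlation matrix. The particular combination is only an example, and the call is wrapped in a requireNamespace() check (an addition for this sketch) so that the script still runs when corrplot is not installed.

```r
M <- cor(mtcars)

# Draw only if corrplot is installed, so the script does not fail otherwise
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(
    M,
    method = "color",        # colored tiles
    type = "upper",          # show only the upper triangle
    order = "hclust",        # group similar variables by hierarchical clustering
    addCoef.col = "black",   # print each coefficient on its tile
    tl.col = "black",        # color of the variable labels
    tl.srt = 45              # rotate the top labels by 45 degrees
  )
}
```

Showing only the upper triangle avoids repeating values, since the matrix is symmetric.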