Visualize Correlation Matrix using symnum function in R Programming

  • Last Updated : 25 Sep, 2020

Correlation measures the degree of linear relationship between two random variables. It is expressed as a coefficient in the interval [-1, 1]. A value of -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship at all. However, a value of 0 does not imply that the variables are independent; they may still be related non-linearly. A correlation matrix collects the correlation coefficients for a set of random variables, taking one pair at a time.
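As a small illustration of these values, the sketch below applies base R's cor() to made-up vectors (x, y_up, y_down and y_sq are illustrative names, not part of the article's examples). The last case is a variable that is completely determined by x and still has zero correlation with it.

```r
# Illustrative data: integers symmetric around zero
x <- -5:5

y_up   <- 2 * x + 1   # increasing straight line
y_down <- -3 * x      # decreasing straight line
y_sq   <- x^2         # depends on x, but not linearly

r_up   <- cor(x, y_up)     # perfect positive linear relationship
r_down <- cor(x, y_down)   # perfect negative linear relationship
r_sq   <- cor(x, y_sq)     # no linear relationship

print(c(r_up, r_down, r_sq))
```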

Properties of Correlation Matrix 

  1. All the diagonal elements of a correlation matrix must be 1, because the correlation of a variable with itself is always perfect, i.e., Cii = 1.
  2. The matrix must be symmetric, i.e., Cij = Cji.
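Both properties can be checked directly on a computed matrix. The sketch below assumes the built-in iris data set as sample input; the variable names are illustrative.

```r
# iris ships with base R; columns 1 to 4 are numeric measurements
cm <- cor(iris[, 1:4])

# Property 1: every diagonal element equals 1
diag_is_one <- isTRUE(all.equal(unname(diag(cm)), rep(1, ncol(cm))))

# Property 2: the matrix equals its transpose
is_symmetric <- isSymmetric(cm)

print(diag_is_one)
print(is_symmetric)
```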

Implementation in R

R has a built-in function, symnum(), that makes the degree of correlation among several variables easy to see at a glance. It replaces each correlation coefficient with a symbol chosen by the strength of the relationship, so highly correlated pairs stand out from the rest. The symnum() function has the following syntax:

Syntax:

symnum(arr, cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95), symbols = c(" ", ".", ",", "+", "*", "B"))

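Parameters:

arr = a numeric or logical vector or array (in R itself this argument is named x).

cutpoints = the values at which the range of arr is split into intervals. With the defaults, for example, coefficients whose absolute value lies between 0.3 and 0.6 are replaced by ".". In a correlation matrix, entries equal to 1 (such as the diagonal) are shown as "1".

symbols = a character vector of symbols, one per interval. In correlation mode (the default when arr is numeric and cutpoints is not supplied), the end points 0 and 1 are added to cutpoints automatically, so there is one more symbol than cutpoints. When cutpoints lists every boundary, including both end points, there is one fewer symbol than cutpoints.

Note: When arr holds correlation coefficients, its values must lie between -1 and 1. The symbol is chosen from the absolute value, so the sign of a coefficient is not shown.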
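To make the relationship between cutpoints and symbols concrete, here is a minimal sketch with custom values. The matrix m and the symbols " ", "o" and "X" are made up for illustration. Because cutpoints is supplied, corr = TRUE is passed explicitly so that symnum() still treats the input as correlations.

```r
# A hand-written 3 x 3 matrix of correlation coefficients (illustrative values)
m <- matrix(c( 1.00, -0.92,  0.65,
              -0.92,  1.00, -0.45,
               0.65, -0.45,  1.00), nrow = 3, byrow = TRUE)

# Two interior cutpoints give three intervals, so three symbols are needed.
# corr = TRUE adds the end points 0 and 1 and uses absolute values.
s <- symnum(m, cutpoints = c(0.5, 0.8),
            symbols = c(" ", "o", "X"), corr = TRUE)
print(s)

# Drop the class to read individual symbols as plain strings
codes <- unclass(s)
print(codes[2, 1])   # |-0.92| is above 0.8
print(codes[3, 1])   # 0.65 is between 0.5 and 0.8
print(codes[3, 2])   # |-0.45| is at most 0.5
```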

Visualization of uni-dimensional numeric array:

R




# defining a one-dimensional numeric vector
arr <- c(6, 4, 3, 2, 5, 1, 8, 7)

# cutpoints are placed at intervals of 2;
# one symbol is given for each of the 4 intervals
symnum(arr, cutpoints = c(0, 2, 4, 6, 8), 
       symbols = c(".", "-", "+", "$"))

Output:

[1] + - - . + . $ $
attr(,"legend")
[1] 0 '.' 2 '-' 4 '+' 6 '$' 8

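Explanation: Values from 0 to 2 (inclusive) are shown as ".", values greater than 2 and up to 4 as "-", values greater than 4 and up to 6 as "+", and values greater than 6 and up to 8 as "$". The output therefore shows each element of arr replaced by the symbol for its interval, and the legend attribute lists the cutpoints with the symbol assigned between them.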
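The interval boundaries can be cross-checked against base R's cut(). The sketch below assumes that, for this input, symnum() bins values the same way as cut() with include.lowest = TRUE; the comparison in the last line tests that assumption rather than taking it for granted.

```r
arr    <- c(6, 4, 3, 2, 5, 1, 8, 7)
breaks <- c(0, 2, 4, 6, 8)
syms   <- c(".", "-", "+", "$")

# symnum() result as a plain character vector (class and legend dropped)
from_symnum <- as.vector(unclass(symnum(arr, cutpoints = breaks,
                                        symbols = syms)))

# The same binning with cut(): intervals are closed on the right,
# and include.lowest = TRUE also admits the lowest cutpoint
from_cut <- as.character(cut(arr, breaks = breaks, labels = syms,
                             include.lowest = TRUE))

print(from_symnum)
print(identical(from_symnum, from_cut))
```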
 



Visualization of the uni-dimensional logical array:
The following code snippet indicates the application of symnum( ) function on the logical array.

R




# a logical vector is passed to symnum();
# by default TRUE is shown as "|"
# and FALSE is shown as "."
symnum(1:7 %% 2 == 0)

Output: 

[1] . | . | . | .

Explanation: The condition 1:7 %% 2 == 0 is evaluated for each element, and each result is replaced by a symbol: "|" for TRUE and "." for FALSE.
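The two default symbols can be replaced. In the sketch below, "o" and "E" are arbitrary illustrative choices; for logical input the first symbol stands for FALSE and the second for TRUE.

```r
is_even <- 1:7 %% 2 == 0

# For logical input, symbols must have exactly two entries:
# the first stands for FALSE, the second for TRUE
flags <- symnum(is_even, symbols = c("o", "E"))
print(flags)
```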

Visualizing correlation matrices in R

Correlation matrices can be easily created by the cor( ) function.

Syntax: cor(x, use = )

Parameters:

x: a numeric matrix or a data frame.

use: specifies how missing values are handled.

This function returns a matrix of correlation coefficients, which can then be passed to symnum() to highlight the highly correlated pairs using the symbols given in the symbols argument.
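The effect of the use argument is easiest to see on data with a gap. The data frame df below is made up for illustration; "complete.obs" is one of the values that cor() accepts for use.

```r
# Illustrative data frame with one missing value in column b
df <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                 b = c(2, 4, NA, 8, 10, 13),
                 c = c(6, 5, 4, 4, 2, 1))

# Default use = "everything": the missing value propagates as NA
cm_default <- cor(df)

# use = "complete.obs": rows containing an NA are dropped first
cm_complete <- cor(df, use = "complete.obs")

print(cm_default)
print(cm_complete)

# The NA-free matrix can be passed straight to symnum()
symnum(cm_complete)
```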

R




# R program to illustrate
# Correlation Matrix

# cor() computes the correlation matrix of the 3 columns
# of a 10 x 3 matrix of random normal values
# (no seed is set, so the values differ on every run)
mat <- cor(matrix(rnorm(30), 10, 3))

# printing the correlation matrix mat
print("Correlation matrix")
print(mat)

# visualising the relation between various elements 
print("Symbolic symnum representation")
symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"))

Output: 

[1] "Correlation matrix"
> print (mat)
         [,1]      [,2]      [,3]
[1,] 1.0000000 0.1295918 0.1137502
[2,] 0.1295918 1.0000000 0.2967970
[3,] 0.1137502 0.2967970 1.0000000
> #visualising the relation between various elements
> print ("Symbolic symnum representation")
[1] "Symbolic symnum representation"
> symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"))
           
[1,] 1      
[2,] |  1  
[3,] |  |  1
attr(,"legend")
[1] 0 '| ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1

Explanation: Only the lower triangle of the matrix is printed by default, because a correlation matrix is symmetric. Each entry is the symbol for the strength of the correlation, and the diagonal is shown as 1.
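The random matrix above has no variable names. As a sketch of how the same call looks on named data, the example below assumes four columns of the built-in mtcars data set; row labels carry over, and column headers may appear abbreviated in the printed result.

```r
# mtcars ships with base R; four numeric columns are used here
cm <- cor(mtcars[, c("mpg", "cyl", "disp", "hp")])

s <- symnum(cm)
print(s)

# Everything above the diagonal is left as an empty string
upper_blank <- all(unclass(s)[upper.tri(cm)] == "")
print(upper_blank)
```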

Several arguments of symnum() can be varied. The following code snippet shows two of them, which control the diagonal and the upper triangle:

R




# R program to illustrate
# Correlation Matrix

# cor() computes the correlation matrix of the 5 columns
# of a 5 x 5 matrix of random exponential values
# (25 values are drawn to fill the 5 x 5 matrix exactly)
mat <- cor(matrix(rexp(25, 1), 5, 5))

# printing the correlation matrix mat
print("Correlation matrix")
print(mat)

# visualising the relation between various 
# elements without diagonal elements
print("Symbolic symnum representation with false diagonal")
symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"), 
       diag = FALSE)

# setting lower = FALSE prints the complete matrix
print("COmplete symnum matrix ")
symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),
       lower = FALSE)

Output: 

[1] "Correlation matrix"
> print (mat)
          [,1]        [,2]       [,3]       [,4]        [,5]
[1,]  1.0000000 -0.39983276 -0.5533282 -0.2420029  0.15030025
[2,] -0.3998328  1.00000000  0.2561824 -0.2090551 -0.05073241
[3,] -0.5533282  0.25618240  1.0000000 -0.6360808 -0.90394274
[4,] -0.2420029 -0.20905508 -0.6360808  1.0000000  0.86086867
[5,]  0.1503003 -0.05073241 -0.9039427  0.8608687  1.00000000
> #visualising the relation between various elements
> print ("Symbolic symnum representation with false diagonal")
[1] "Symbolic symnum representation with false diagonal"
> symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),diag=FALSE)
             
[1,]          
[2,] .        
[3,] .  |      
[4,] |  |  ,  
[5,] |  |  * +
attr(,"legend")
[1] 0 '| ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1
> print ("COmplete symnum matrix ")
[1] "COmplete symnum matrix "
> symnum(mat, symbols = c("| ", ".", ",", "+", "*", "B"),lower=FALSE)
                 
[1,] 1  .  .  |  |
[2,] .  1  |  |  |
[3,] .  |  1  ,  *
[4,] |  |  ,  1  +
[5,] |  |  *  +  1
attr(,"legend")
[1] 0 '| ' 0.3 '.' 0.6 ',' 0.8 '+' 0.9 '*' 0.95 'B' 1

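Explanation: diag = FALSE suppresses the diagonal elements, that is, the 1s that indicate the perfect correlation of each variable with itself. lower = FALSE prints the complete matrix instead of only the lower triangle. Both are abbreviations of the full argument names diag.lower.tri and lower.triangular, which R accepts through partial argument matching.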
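Since the matrices above are random, the same two options are sketched below on a fixed, hand-written matrix (m holds illustrative values only) so that the result is reproducible. The full argument names are spelled out here.

```r
# A hand-written 3 x 3 matrix of correlation coefficients (illustrative values)
m <- matrix(c( 1.00, -0.92,  0.65,
              -0.92,  1.00, -0.45,
               0.65, -0.45,  1.00), nrow = 3, byrow = TRUE)

# Full argument names instead of the abbreviations diag and lower
no_diag <- unclass(symnum(m, diag.lower.tri = FALSE))
full    <- unclass(symnum(m, lower.triangular = FALSE))

# Strip the legend so that only the symbols remain
attr(no_diag, "legend") <- NULL
attr(full, "legend") <- NULL

print(no_diag)
print(full)
```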



