
Visualize Confusion Matrix Using Caret Package in R


In this article, we are going to visualize a confusion matrix using the caret package in the R programming language.

What is a Confusion Matrix?

A confusion matrix is a table used to compare a model's predicted values against the actual values. In caret's output, the row headers represent predicted values and the column headers represent actual values. For a two-class problem, the confusion matrix contains four cells, as shown in the image below.

[Image: the four cells of a confusion matrix]

  1. True Negative – the number of negative values the model correctly predicted as negative.
  2. False Positive – the number of negative values the model incorrectly predicted as positive.
  3. False Negative – the number of positive values the model incorrectly predicted as negative.
  4. True Positive – the number of positive values the model correctly predicted as positive.
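These four counts can be computed directly from a pair of logical vectors, as a quick illustration (the vectors below are made up for demonstration):

```r
# Hypothetical actual and predicted labels
actual    <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
predicted <- c(TRUE, TRUE, FALSE, TRUE, FALSE)

# Count each cell of the confusion matrix
TP <- sum(predicted & actual)    # predicted positive, actually positive
TN <- sum(!predicted & !actual)  # predicted negative, actually negative
FP <- sum(predicted & !actual)   # predicted positive, actually negative
FN <- sum(!predicted & actual)   # predicted negative, actually positive

c(TP = TP, TN = TN, FP = FP, FN = FN)
# TP TN FP FN
#  2  1  1  1
```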

The confusionMatrix() function

In R, a confusion matrix can be computed and displayed with the confusionMatrix() function from the caret package.

Syntax: confusionMatrix(data, reference, positive = NULL, dnn = c("Prediction", "Reference"))

where,

  • data – a factor of predicted classes.
  • reference – a factor of classes to be used as the true results.
  • positive (optional) – a character string naming the factor level that corresponds to the "positive" result.
  • dnn (optional) – a character vector of dimnames for the table.
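As a sketch of the optional arguments (the factor levels here are invented for illustration), positive picks which level counts as the positive class and dnn relabels the table dimensions:

```r
library(caret)

pred  <- factor(c("yes", "no", "yes", "no"), levels = c("yes", "no"))
truth <- factor(c("yes", "yes", "no", "no"), levels = c("yes", "no"))

# Treat "yes" as the positive class and rename the dimensions
cm <- confusionMatrix(data = pred, reference = truth,
                      positive = "yes",
                      dnn = c("Predicted", "Actual"))
cm$positive  # "yes"
```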

Create Confusion Matrix in R using confusionMatrix() function

Step 1: First we need to install and load the required package. Run the command below in R to install the caret package.

# Install the required package
install.packages("caret")

Step 2: Next we need to initialize our predicted and actual data. In our example, we will be using two factors that represent predicted and actual values.

# Load the installed package
library(caret)
 
# Initialization of Sample factors
# of predicted and actual values
pred_values <- factor(c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE))
actual_values <- factor(c(FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE))


Step 3: Next, pass the predicted and actual factors to confusionMatrix() from the caret package to compute and display the confusion matrix.

# Confusion Matrix
cf <- caret::confusionMatrix(data=pred_values,
                     reference=actual_values)
print(cf)


Output:

Confusion Matrix and Statistics

          Reference
Prediction FALSE TRUE
     FALSE     2    2
     TRUE      1    2

Accuracy : 0.5714
95% CI : (0.1841, 0.901)
No Information Rate : 0.5714
P-Value [Acc > NIR] : 0.6531

Kappa : 0.16

Mcnemar's Test P-Value : 1.0000

Sensitivity : 0.6667
Specificity : 0.5000
Pos Pred Value : 0.5000
Neg Pred Value : 0.6667
Prevalence : 0.4286
Detection Rate : 0.2857
Detection Prevalence : 0.5714
Balanced Accuracy : 0.5833

'Positive' Class : FALSE

After running the above code we get the output shown above. Treating TRUE as the positive class, the factors pred_values and actual_values produce two true positives, two true negatives, two false negatives, and one false positive. Note that caret reports FALSE as the 'Positive' class because, by default, it uses the first factor level.

Visualizing Confusion Matrix using fourfoldplot() function

The confusion matrix can also be plotted using the built-in fourfoldplot() function in R. The fourfoldplot() function accepts only a 2x2 table or array, but confusionMatrix() returns an object of class confusionMatrix, so we first extract the underlying counts with as.table().

Syntax: fourfoldplot(x,color,main)

Where,

  • x – the 2x2 table or array to plot
  • color – a vector of length 2 specifying the colors for the diagonals
  • main – the title of the fourfold plot

# Visualizing Confusion Matrix
fourfoldplot(as.table(cf), color = c("yellow", "pink"),
             main = "Confusion Matrix")


Output:

[Image: fourfold plot of the confusion matrix]

Measuring the performance

We can measure the performance of our model with accuracy, the fraction of predictions that are correct:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
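Plugging in the counts from the caret example above (with TRUE treated as the positive class, so TP = 2, TN = 2, FP = 1, FN = 2):

```r
# Counts read off the caret confusion matrix above
TP <- 2; TN <- 2; FP <- 1; FN <- 2

accuracy <- (TP + TN) / (TP + TN + FP + FN)
accuracy  # 0.5714286, matching the Accuracy line in the caret output
```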

Confusion matrix using gmodels

To create a confusion matrix using the “gmodels” package in R, we use the CrossTable() function. This function allows us to create a cross-tabulation table, which is essentially a confusion matrix.

library(gmodels)
 
# Example data (actual and predicted classes)
actual_values <- c("Positive", "Negative", "Positive", "Negative", "Positive", "Positive")
pred_values <- c("Positive", "Negative", "Positive", "Negative", "Negative", "Positive")
 
# Create a confusion matrix using CrossTable
confusion_matrix <- CrossTable(actual_values, pred_values, prop.chisq = FALSE,
                               prop.t = FALSE, prop.r = FALSE)
 
# Print the confusion matrix
print(confusion_matrix)


Output:

   Cell Contents
|-------------------------|
|                       N |
|           N / Col Total |
|-------------------------|

Total Observations in Table:  6

              | pred_values
actual_values |  Negative |  Positive | Row Total |
--------------|-----------|-----------|-----------|
     Negative |         2 |         0 |         2 |
              |     0.667 |     0.000 |           |
--------------|-----------|-----------|-----------|
     Positive |         1 |         3 |         4 |
              |     0.333 |     1.000 |           |
--------------|-----------|-----------|-----------|
 Column Total |         3 |         3 |         6 |
              |     0.500 |     0.500 |           |
--------------|-----------|-----------|-----------|
$t
          y
x          Negative Positive
  Negative        2        0
  Positive        1        3

$prop.row
          y
x          Negative Positive
  Negative     1.00     0.00
  Positive     0.25     0.75

$prop.col
          y
x           Negative  Positive
  Negative 0.6666667 0.0000000
  Positive 0.3333333 1.0000000

$prop.tbl
          y
x           Negative  Positive
  Negative 0.3333333 0.0000000
  Positive 0.1666667 0.5000000

In the context of classification tasks, "actual" refers to the true labels of our data.

  • The "predicted" labels are the outcomes our model predicts.
  • A confusion matrix is a table that shows how well a classification model is performing by comparing the predicted labels with the actual labels. The CrossTable() function from the "gmodels" package generates this matrix.
  • CrossTable() can also include cell proportions (for accuracy, precision, recall, and so on). Here, the chi-square contributions, table proportions, and row proportions are turned off (prop.chisq, prop.t, and prop.r are set to FALSE), so only counts and column proportions are shown.
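Base R's table() produces the same counts as CrossTable() without any proportions, which can be handy for a quick check (reusing the vectors from the example above):

```r
actual_values <- c("Positive", "Negative", "Positive", "Negative", "Positive", "Positive")
pred_values   <- c("Positive", "Negative", "Positive", "Negative", "Negative", "Positive")

# Cross-tabulate actual vs. predicted labels, counts only
counts <- table(actual_values, pred_values)
counts
#              pred_values
# actual_values Negative Positive
#      Negative        2        0
#      Positive        1        3
```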

Visualize Confusion Matrix

# Reuse the vectors from the gmodels example above
conf_matrix <- table(actual_values, pred_values)
 
# Visualize the confusion matrix using heatmap
heatmap(conf_matrix,
        main = "Confusion Matrix",
        xlab = "Predicted",
        ylab = "Actual",
        col = heat.colors(10),
        scale = "column",
        margins = c(5, 5))


Output:

[Image: heatmap of the confusion matrix]

In this example, the table() function is used to create the confusion matrix directly. The heatmap() function then visualizes the matrix.



Last Updated : 19 Dec, 2023