
ML | MultiLabel Ranking Metrics – Coverage Error

Last Updated : 23 Mar, 2020
The coverage error tells us how many of the top-scored predicted labels we have to include so that no ground-truth label is missed. This is useful when we want to know, on average, how many of the top-ranked predictions are required in order not to miss any true label. Given a binary indicator matrix of ground-truth labels y \in \{0, 1\}^{n_{samples} \times n_{labels}}, the score associated with each label is denoted by \hat{f}, where
 \hat{f} \in \mathbb{R}^{n_{samples} \times n_{labels}}
The coverage error is defined as:
 coverage\left( y, \hat{f} \right) = \dfrac{1}{n_{samples}} \sum_{i=0}^{n_{samples}-1} \max_{j: y_{ij}=1} rank_{ij}
where rank is defined as
 rank_{ij} = \left| \left\{ k : \hat{f}_{ik} \geq \hat{f}_{ij} \right\} \right|
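To make the formula concrete, here is a minimal NumPy sketch of the definition above; the helper name coverage_error_manual and its loop structure are assumptions for illustration, not part of scikit-learn, and it assumes every sample has at least one true label.
import numpy as np

def coverage_error_manual(y_true, y_score):
    # rank_ij counts how many labels score at least as high as label j;
    # for each sample keep the largest rank among its true labels,
    # then average these worst ranks over all samples.
    n_samples = y_true.shape[0]
    worst_ranks = np.empty(n_samples)
    for i in range(n_samples):
        true_idx = np.flatnonzero(y_true[i])
        worst_ranks[i] = max((y_score[i] >= y_score[i, j]).sum() for j in true_idx)
    return worst_ranks.mean()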
Code: Computing the coverage error for prediction scores and true labels using scikit-learn.
# Import the required libraries
import numpy as np
from sklearn.metrics import coverage_error

# Create an imaginary set of true labels and prediction scores
y_true = np.array([[1, 0, 1], [0, 0, 1], [0, 1, 1]])
y_pred_score = np.array([[0.75, 0.5, 1], [1, 1, 1.2], [2.3, 1.2, 0.1]])
print(coverage_error(y_true, y_pred_score))

Output:
2.0
Let’s calculate the coverage error of the above example manually. Our first sample has a ground-truth value of [1, 0, 1]. To cover both true labels we sort its predictions (here [0.75, 0.5, 1]) in descending order; both true labels fall within the two highest-scored labels, so this sample needs the top-2 predictions. Similarly, the second and third samples need the top-1 and top-3 predictions respectively. Averaging these counts over the number of samples gives an output of 2.0.
 Coverage\ Error = \dfrac{\left( 2 + 1 + 3 \right)}{3} = 2.0
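As a rough check of this walkthrough, the illustrative loop below (reusing y_true, y_pred_score, and np from the earlier snippet) prints the worst rank needed for each sample.
# Illustrative only: print the worst rank among true labels for each sample
for i, (truth, scores) in enumerate(zip(y_true, y_pred_score), start=1):
    true_idx = np.flatnonzero(truth)
    worst_rank = max((scores >= scores[j]).sum() for j in true_idx)
    print(f"Sample {i}: top-{worst_rank} predictions needed")
# Prints top-2, top-1, top-3, which average to (2 + 1 + 3) / 3 = 2.0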
The best possible value of the coverage error is obtained when it is equal to the average number of true labels per sample.
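For instance, with the hypothetical scores y_best_score below, where every true label of y_true is scored above every false label, the coverage error equals the average number of true labels per sample (a minimal sketch continuing from the earlier snippet).
# Hypothetical scores where every true label outranks every false label;
# reuses y_true and coverage_error from the snippet above.
y_best_score = np.array([[0.9, 0.1, 0.8], [0.1, 0.2, 0.9], [0.1, 0.9, 0.8]])
print(coverage_error(y_true, y_best_score))  # 1.666..., the best achievable value here
print(y_true.sum(axis=1).mean())             # 1.666..., average number of true labels per sample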
