
How To Create/Customize Your Own Scorer Function In Scikit-Learn?

Last Updated : 31 Jul, 2023

Scikit-learn is a widely used Python machine learning library. It provides tools for building models, along with a framework for evaluating them using a range of metrics and scoring functions. In some situations the built-in metrics are not enough, and users want to define their own scoring function. Scikit-learn supports this, and this article walks through how to create and customize your own scorer.

In scikit-learn, a metric function accepts two arguments, the ground truth (actual values) and the model’s predicted values, and returns a single score that measures the quality of the predictions. A scorer wraps such a metric so that tools like cross-validation can call it with an estimator and data. Scikit-learn provides predefined metrics such as accuracy, precision, recall, and F1-score. When none of these fits the problem, users can define their own.

Custom scorer for a regression problem

To create a custom scorer function in scikit-learn, follow these steps:

Step 1: Create a custom function that evaluates the accuracy

Create a Python function that accepts two arguments, in this order: the ground truth (actual values) and the model’s predicted values. The function should return a single score that measures the quality of the predictions.

Here we define the coefficient of determination (R²) as the metric.

The coefficient of determination (R²) is a statistical measure of how well a model predicts an outcome. It measures the proportion of variance in the dependent variable (the target) that is explained by the independent input variable(s) in a regression model.

R^2 = 1- \frac{RSS}{TSS}

Here,

  • RSS = residual sum of squares, also known as the sum of squared errors. It measures the variation that is not explained by the regression model and is the sum of squared differences between the predicted values and the actual target values.

RSS = \sum(pred-actual)^2

  • TSS = total sum of squares. It represents the total variation in the dependent variable and is the sum of squared differences between the actual values and the mean of the dependent variable.

TSS = \sum (actual-mean)^2

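The value of R² is at most 1, with higher values indicating a better fit. A value of 1 indicates a perfect fit, and a value of 0 means the model does no better than always predicting the mean of the target. Negative values are possible when the model does worse than predicting the mean, which can happen on held-out data.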

Python3

import numpy as np
 
def r_squared(y_true, y_pred):
    # Calculate the mean of the true values
    mean_y_true = np.mean(y_true)
 
    # Calculate the sum of squares of residuals and total sum of squares
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - mean_y_true) ** 2)
 
    # Calculate R²
    r2 = 1 - (ss_res / ss_tot)
 
    return r2
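
As a quick sanity check, the custom metric can be compared with scikit-learn's built-in `r2_score` on a few hand-picked values. This is a minimal sketch: the sample numbers are arbitrary, and the `np.asarray` conversion is an addition here so that plain Python lists are accepted as input.

```python
import numpy as np
from sklearn.metrics import r2_score as sklearn_r2


def r_squared(y_true, y_pred):
    # Convert inputs to arrays so lists work too (an addition for this sketch)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Residual and total sums of squares, as defined above
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot


# Arbitrary illustrative values
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

ours = r_squared(y_true, y_pred)
reference = sklearn_r2(y_true, y_pred)
print(f"custom: {ours:.4f}, scikit-learn: {reference:.4f}")
# custom: 0.9486, scikit-learn: 0.9486
```

Here RSS = 1.5 and TSS = 29.1875, so R² = 1 - 1.5 / 29.1875 ≈ 0.9486, which matches the built-in metric.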

                    

Step 2: Create a scorer object

Once the scoring function has been defined, create a scorer object with scikit-learn's make_scorer() function. make_scorer() takes the scoring function as an argument and returns a scorer object.

Python3

from sklearn.metrics import make_scorer

# Create a scorer object using the r_squared function defined in Step 1
r2_score = make_scorer(r_squared)
r2_score

                    

Output:

make_scorer(r_squared)
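
When the custom metric is an error rather than a score (lower is better), `make_scorer` accepts `greater_is_better=False`, which negates the value so that model selection tools can still maximize it. The sketch below illustrates this with a hypothetical `rmse` function; the synthetic dataset and the `LinearRegression` model are assumptions chosen only to keep the example small. It also shows that a scorer is called with an estimator and data, not with predictions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer


def rmse(y_true, y_pred):
    # Root mean squared error: lower values mean better predictions
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))


# greater_is_better=False makes the scorer return the negated metric
neg_rmse_scorer = make_scorer(rmse, greater_is_better=False)

# Synthetic data, used here only for illustration
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
model = LinearRegression().fit(X, y)

# A scorer is called as scorer(estimator, X, y)
score = neg_rmse_scorer(model, X, y)
print(f"Negated RMSE: {score:.3f}")
```

The printed value is the negative of `rmse(y, model.predict(X))`, so a score closer to zero indicates a better model.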

Step 3: Use the scorer object

After creating the scorer object, we can use it to assess a machine learning model’s performance with scikit-learn's cross-validation functions, which evaluate the model on different subsets of the dataset, or with other model assessment tools.

Python3

from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
 
# Load the California Housing Price dataset
X, y = fetch_california_housing(return_X_y=True)
 
# Create a Random Forest regression model
model = RandomForestRegressor()
 
 
# Evaluate the performance of the model using
# cross-validation with the custom r2_score scorer
scores = cross_val_score(model,
                         X, y,
                         cv=5,
                         scoring=r2_score)
 
# Print the mean and standard deviation of the scores
print(f"R2 Squared: {scores.mean():.2f} +/- {scores.std():.2f}")

                    

Output:

R2 Squared: 0.65 +/- 0.08
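
The same kind of scorer can also be passed to hyperparameter search through the `scoring` parameter. The following is a minimal sketch, not part of the walkthrough above: the `Ridge` model, the `alpha` grid, and the synthetic dataset are assumptions picked so that the example runs quickly.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV


def r_squared(y_true, y_pred):
    # Same R² definition as in Step 1
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1 - ss_res / ss_tot


r2_scorer = make_scorer(r_squared)

# Small synthetic regression problem (illustrative only)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Hypothetical grid of regularization strengths
param_grid = {"alpha": [0.1, 1.0, 10.0]}

# GridSearchCV picks the parameters with the highest mean custom score
search = GridSearchCV(Ridge(), param_grid=param_grid, scoring=r2_scorer, cv=5)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best mean R²: {search.best_score_:.3f}")
```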

Custom scorer for a multi-class classification problem

Steps:

  • Import the necessary libraries 
  • Load the iris dataset
  • Define multiple scorers by wrapping accuracy_score, precision_score, recall_score, and f1_score with make_scorer.
  • Create an XGBClassifier model
  • Evaluate the model using cross-validation and the custom scorer
  • Print the mean scores for each metric
     

Python3

from sklearn.metrics import make_scorer, accuracy_score
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.model_selection import cross_validate
from sklearn.datasets import load_iris
from xgboost import XGBClassifier
 
 
# Load the iris dataset
iris = load_iris()
 
# Define multiple metrics
scoring = {'accuracy': make_scorer(accuracy_score),
           'precision': make_scorer(precision_score, average='macro'),
           'recall': make_scorer(recall_score, average='macro'),
           'f1-score': make_scorer(f1_score, average='macro')
          }
 
# Create an XGBClassifier
clf = XGBClassifier(n_estimators=2,
                    max_depth=3,
                    learning_rate=0.1)
 
# Evaluate the model using cross-validation and the custom scorer
scores = cross_validate(clf, iris.data, iris.target, cv=5, scoring=scoring)
 
# Print the mean scores for each metric
print("Accuracy mean score:", scores['test_accuracy'].mean())
print("Precision mean score:", scores['test_precision'].mean())
print("Recall mean score:", scores['test_recall'].mean())
print("f1-score:", scores['test_f1-score'].mean())

                    

Output:

Accuracy mean score: 0.9666666666666668
Precision mean score: 0.9707070707070707
Recall mean score: 0.9666666666666668
f1-score: 0.9664818612187034
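
The scorers above wrap built-in metrics. A hand-written classification metric can be wrapped in the same way. The sketch below defines a hypothetical `worst_class_recall` metric (the lowest recall across classes) and evaluates it alongside macro recall. The `DecisionTreeClassifier` is an assumption used here to avoid the extra xgboost dependency; any classifier could be substituted.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier


def worst_class_recall(y_true, y_pred):
    # average=None returns one recall value per class; keep the lowest
    per_class = recall_score(y_true, y_pred, average=None)
    return float(per_class.min())


# Extra keyword arguments to make_scorer are forwarded to the metric
scoring = {
    "macro_recall": make_scorer(recall_score, average="macro"),
    "worst_recall": make_scorer(worst_class_recall),
}

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0)

# For classifiers, an integer cv uses stratified folds
scores = cross_validate(clf, X, y, cv=5, scoring=scoring)

print("Macro recall mean:", scores["test_macro_recall"].mean())
print("Worst-class recall mean:", scores["test_worst_recall"].mean())
```

Because the minimum of the per-class recalls can never exceed their average, the worst-class recall in each fold is at most the macro recall, which makes it a stricter view of the same model.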


 


