Generally, for any classification problem, we predict the class value that has the highest probability of being the true class label. However, sometimes, we want to predict the **probabilities of a data instance belonging to each class label**. For example, say we are building a model to classify fruits and we have three class labels: apples, oranges, and bananas (each fruit is one of these). For any fruit, we want the probabilities of the fruit being an apple, an orange, or a banana.
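As a minimal sketch of what predicting per-class probabilities looks like in scikit-learn, the snippet below fits a classifier on a three-class problem and asks for probabilities instead of hard labels. The iris dataset and `LogisticRegression` are assumptions chosen only for illustration; they stand in for the fruit example and are not part of the original article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Assumption: iris stands in for the fruit example (three exclusive classes)
X, y = load_iris(return_X_y=True)

clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)

# One row per instance, one column per class (ordered as in clf.classes_)
proba = clf.predict_proba(X[:5])

# The usual hard prediction is simply the most probable class
labels = clf.classes_[proba.argmax(axis=1)]

print(proba.round(3))
print(labels)
```

Each row of `proba` sums to 1, and taking the column with the largest value recovers the label that `predict` would return.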

This is very useful for the evaluation of a classification model. It can help us understand how ‘sure’ a model is while predicting a class label and may help us interpret how decisive a classification model is. Generally, a classifier is called **calibrated** when its predicted probabilities match the observed frequencies: of all the instances to which it assigns a probability of about 0.8 for a class, roughly 80% should actually belong to that class. The problem is, not all classification models are calibrated.

Some models can give poor estimates of class probabilities and some do not even support probability prediction.
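One quick way to see which estimators support probability prediction is to check for a `predict_proba` method. The sketch below does this for three scikit-learn classifiers; the choice of models is an assumption made for illustration, and no fitting is needed for the check.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, LinearSVC

# Hypothetical selection of models to inspect
models = {
    "LogisticRegression": LogisticRegression(),
    "SVC (default settings)": SVC(),
    "LinearSVC": LinearSVC(),
}

support = {}
for name, model in models.items():
    support[name] = {
        # True only if the estimator exposes class probabilities
        "predict_proba": hasattr(model, "predict_proba"),
        # Uncalibrated scores (signed distance to the decision boundary)
        "decision_function": hasattr(model, "decision_function"),
    }
    print(name, support[name])
```

Models that only expose `decision_function` return raw scores rather than probabilities, so the scores have to be mapped into the range [0, 1] before a calibration curve can be drawn.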

**Calibration Curves:**

Calibration curves are used to evaluate how well calibrated a classifier is, i.e., how closely its predicted probabilities match the observed frequency of the positive class. The predictions are first grouped into bins by predicted probability. The x-axis represents the average predicted probability in each bin. The y-axis is the fraction of positives, i.e., the proportion of instances in each bin that actually belong to the positive class. The curve of the **ideal calibrated model** is the straight diagonal line from (0, 0) to (1, 1).
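To make the binning concrete, here is a rough sketch of how the points of a calibration curve can be computed by hand with NumPy. The function name `manual_calibration_curve` and the toy labels and probabilities are hypothetical; this illustrates the idea and is not scikit-learn's implementation.

```python
import numpy as np

def manual_calibration_curve(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, fraction of positives) per bin."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)

    # Equal-width bin edges over [0, 1]
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Using only the interior edges gives bin indices 0 .. n_bins - 1
    bin_ids = np.digitize(y_prob, edges[1:-1])

    mean_pred, frac_pos = [], []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():  # skip empty bins
            mean_pred.append(y_prob[mask].mean())  # x-axis value
            frac_pos.append(y_true[mask].mean())   # y-axis value
    return np.array(mean_pred), np.array(frac_pos)

# Made-up toy data: true labels and predicted probabilities
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

mean_pred, frac_pos = manual_calibration_curve(y_true, y_prob, n_bins=2)
print(mean_pred, frac_pos)
```

In this toy data the average predicted probability and the fraction of positives agree in both bins (about 0.25 and 0.75), so both points fall on the diagonal.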

**Plotting Calibration Curves in Python3:**

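For this example, we will use a binary classification dataset: the popular breast cancer dataset that ships with scikit-learn and is loaded with `load_breast_cancer`. You can learn more about this dataset in the scikit-learn documentation.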

**Code: Plotting a Support Vector Machine’s calibration curve and comparing it with a perfectly calibrated model’s curve.**

```python
# Importing required modules
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.calibration import calibration_curve
import matplotlib.pyplot as plt

# Loading dataset
dataset = load_breast_cancer()
X = dataset.data
y = dataset.target

# Splitting dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=13)

# Creating and fitting model
model = SVC()
model.fit(X_train, y_train)

# SVC has no predict_proba by default, so use the decision function scores
scores = model.decision_function(X_test)

# Min-max scale the scores into [0, 1]. This replaces the `normalize=True`
# argument of calibration_curve, which recent scikit-learn releases
# no longer accept.
prob = (scores - scores.min()) / (scores.max() - scores.min())

# Creating calibration curve: returns the fraction of positives and the
# mean predicted value for each bin, in that order
prob_true, prob_pred = calibration_curve(y_test, prob, n_bins=10)

# Plot perfectly calibrated
plt.plot([0, 1], [0, 1], linestyle='--', label='Ideally Calibrated')

# Plot model's calibration curve
plt.plot(prob_pred, prob_true, marker='.', label='Support Vector Classifier')

plt.legend(loc='upper left')
plt.xlabel('Average Predicted Probability in each bin')
plt.ylabel('Ratio of positives')
plt.show()
```


**Output:**

From the graph, we can clearly see that the Support Vector classifier is not very well calibrated. The closer a model’s curve is to the perfectly calibrated model’s curve (dashed line), the better calibrated it is.
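If a single number is more convenient than eyeballing the plot, one simple option is to average the vertical distance between the model's curve and the diagonal. The helper `mean_calibration_gap` below is a hypothetical sketch, and the two sets of curve points are made-up values for illustration, not results from the model above.

```python
import numpy as np

def mean_calibration_gap(prob_true, prob_pred):
    """Unweighted mean absolute gap between a calibration curve and the diagonal."""
    prob_true = np.asarray(prob_true, dtype=float)
    prob_pred = np.asarray(prob_pred, dtype=float)
    return float(np.mean(np.abs(prob_true - prob_pred)))

# Made-up curve points: (fraction of positives, mean predicted probability)
well = mean_calibration_gap([0.10, 0.50, 0.90], [0.12, 0.48, 0.91])
poor = mean_calibration_gap([0.40, 0.50, 0.60], [0.10, 0.50, 0.90])

print(f"close to the diagonal: {well:.3f}")
print(f"far from the diagonal: {poor:.3f}")
```

A smaller value means the curve hugs the diagonal more closely. Note that this sketch weights every bin equally, regardless of how many predictions fall into it.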

**Conclusion:**

Now that you know what calibration means in Machine Learning and how to plot a calibration curve, the next time your classifier gives unpredictable results and you can’t find the cause, try plotting the calibration curve and check whether the model is well calibrated.
