Generally, for any classification problem, we predict the class value that has the highest probability of being the true class label. However, sometimes, we want to predict the probabilities of a data instance belonging to each class label. For example, say we are building a model to classify fruits and we have three class labels: apples, oranges, and bananas (each fruit is one of these). For any fruit, we want the probabilities of the fruit being an apple, an orange, or a banana.
This is very useful when evaluating a classification model. It helps us understand how ‘sure’ a model is when predicting a class label and how decisive it is. A classifier is called calibrated when its predicted probabilities match the observed frequencies: among all instances for which it predicts a probability of 0.8 for a class, about 80% should actually belong to that class. The problem is, not all classification models are calibrated.
Some models can give poor estimates of class probabilities and some do not even support probability prediction.
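To make the difference between predicting a label and predicting class probabilities concrete, here is a minimal sketch using scikit-learn. It assumes the built-in iris dataset as a three-class stand-in for the fruit example and a logistic regression model; neither comes from the original article.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Stand-in data (assumption): iris has three classes, like the fruit example.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# predict() returns only the most probable class label for each instance.
labels = model.predict(X_test)

# predict_proba() returns one probability per class; each row sums to 1.
probabilities = model.predict_proba(X_test)
print(labels[0], probabilities[0])

# Some estimators, such as LinearSVC, do not offer predict_proba at all.
supports_proba = hasattr(LinearSVC(), "predict_proba")
print(supports_proba)
```

The columns of `probabilities` follow the order of `model.classes_`, so each row can be read as the probability of the instance belonging to each class.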
Calibration curves are used to evaluate how well calibrated a classifier is, i.e., how closely its predicted probabilities match the actual outcomes. The predictions are grouped into bins by predicted probability. The x-axis represents the average predicted probability in each bin. The y-axis is the ratio of positives: the proportion of instances in each bin whose true class is the positive class. The curve of an ideally calibrated model is the straight diagonal line from (0, 0) to (1, 1).
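The binning behind a calibration curve can be sketched in plain Python. The function name `calibration_points`, the equal-width binning, and the toy data below are illustrative assumptions, not part of the original article.

```python
def calibration_points(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, fraction of positives) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # Equal-width bins over [0, 1]; a probability of exactly 1.0 goes in the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((label, prob))

    mean_predicted, fraction_positive = [], []
    for members in bins:
        if not members:
            continue  # an empty bin contributes no point to the curve
        labels = [label for label, _ in members]
        probs = [prob for _, prob in members]
        mean_predicted.append(sum(probs) / len(probs))        # x-axis value
        fraction_positive.append(sum(labels) / len(labels))   # y-axis value
    return mean_predicted, fraction_positive


# Hypothetical toy data: true labels and predicted probabilities of the positive class.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
mean_predicted, fraction_positive = calibration_points(y_true, y_prob, n_bins=2)
print(mean_predicted, fraction_positive)
```

With two bins, this toy data gives an average predicted probability of about 0.25 and a ratio of positives of 0.25 in the first bin, and about 0.75 for both in the second, so both points fall on the diagonal of a perfectly calibrated model.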
Plotting Calibration Curves in Python3:
For this example, we will use a binary classification dataset: the popular diabetes dataset. You can learn more about this dataset here.
Code: Plotting a Support Vector Machine’s calibration curve and comparing it with a perfectly calibrated model’s curve.
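A minimal sketch of such a comparison follows. It is not the article's original listing. Because the diabetes data file is not included here, it assumes a synthetic binary dataset from `make_classification` as a stand-in, and it assumes the SVM's decision scores are min-max scaled into [0, 1] so they can be treated as probabilities. The resulting plot will therefore differ in detail from one produced on the diabetes dataset.

```python
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data (assumption): a synthetic binary problem, not the diabetes dataset.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

svc = SVC()
svc.fit(X_train, y_train)

# SVC without probability=True exposes only decision scores,
# so scale them into [0, 1] to use them as uncalibrated probabilities.
scores = svc.decision_function(X_test)
prob_pos = (scores - scores.min()) / (scores.max() - scores.min())

# calibration_curve returns the fraction of positives first,
# then the mean predicted probability, one value per non-empty bin.
fraction_of_positives, mean_predicted = calibration_curve(
    y_test, prob_pos, n_bins=10
)

# Dotted diagonal: the curve of a perfectly calibrated model.
plt.plot([0, 1], [0, 1], linestyle=":", label="Perfectly calibrated")
plt.plot(mean_predicted, fraction_of_positives, marker="o",
         label="Support Vector Classifier")
plt.xlabel("Average predicted probability in each bin")
plt.ylabel("Ratio of positives")
plt.legend()
plt.show()
```

To try this on a real dataset, replace the `make_classification` call with code that loads the feature matrix `X` and the binary label vector `y`.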
From the graph, we can clearly see that the Support Vector classifier is not very well calibrated. The closer a model’s curve is to the perfectly calibrated model’s curve (dotted curve), the better calibrated it is.
Now that you know what calibration means in Machine Learning and how to plot a calibration curve, the next time your classifier gives unpredictable results and you can’t find the cause, try plotting the calibration curve to check whether the model is well calibrated.