In this article, we look at the Datapoint editor tab of the What-If Tool, using the same model that we used in the What-If introduction article. The model predicts whether a house is worth more than $160,000.
When you initialize the What-If Tool, you’re brought straight into the Datapoint editor tab. It plots all of our test data along with the model’s prediction for each data point. For the data points towards the top, our model is highly confident that the house is worth more than $160,000. For the data points towards the bottom, it is highly confident that the house is worth less than $160,000. Our model is less confident about the data points towards the middle.
Let’s see what happens when we click on an individual data point. Here we can see all the feature values for this data point.
We can also change any of the values to see how this affects our model’s prediction. The next capability we’ll look at is partial dependence plots. With a data point selected, we can see how different features contribute to the model’s prediction for this data point, or for all the data points in our data set. These plots are calculated by having the model make predictions for the selected data point while changing the value of only one feature at a time, and then plotting the results. The dot in this graph is the data point we’ve selected, and the line indicates the effect this feature has on our model’s prediction.
For example, here the later the house was built, the more likely our model is to predict a higher price. Houses with unfinished garages are likely to be priced lower by our model, and houses with a fireplace are likely to be priced higher.
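The calculation behind a partial dependence plot for a single data point can be sketched in a few lines of Python. This is only an illustration, not the What-If Tool’s own code: `toy_model`, its coefficients, and the feature names `year_built`, `fireplaces`, and `garage_unfinished` are all hypothetical stand-ins for the trained model and its inputs.

```python
import math

def toy_model(house):
    """Hypothetical stand-in for the trained model.

    Returns a made-up probability that the house is worth more than $160,000.
    """
    score = (
        0.03 * (house["year_built"] - 1970)
        + 0.8 * house["fireplaces"]
        - 1.0 * house["garage_unfinished"]
    )
    return 1.0 / (1.0 + math.exp(-score))

def partial_dependence(predict, datapoint, feature, values):
    """Vary one feature of a single data point and record each prediction."""
    curve = []
    for value in values:
        modified = dict(datapoint)  # copy, so the selected point is unchanged
        modified[feature] = value   # change only the one feature
        curve.append((value, predict(modified)))
    return curve

# The selected data point (illustrative values).
selected = {"year_built": 1985, "fireplaces": 1, "garage_unfinished": 0}

# Sweep the build year while holding every other feature fixed.
curve = partial_dependence(toy_model, selected, "year_built", range(1950, 2011, 10))
for year, probability in curve:
    print(f"{year}: {probability:.3f}")
```

Plotting the `(value, prediction)` pairs in `curve` gives the line in the plot, and the selected data point’s own value and prediction give the dot.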
There’s even more you can do with the Datapoint editor. If we go back, select a data point, and click Show nearest counterfactual, we can see the data point whose features are most similar to the one we’ve selected but which has the opposite prediction. In the editor panel on the left, the What-If Tool highlights the differences between these two data points in green and bold, so we can see what might have caused the opposite prediction.
We can also make all sorts of custom visualizations of our data points, making use of any of the features of the data points or the model results. Here we’ve created a histogram showing the distribution of houses by square feet, with the points colored by the model’s predictions.
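The idea of a nearest counterfactual can be sketched as a simple search: among the data points with the opposite prediction, pick the one closest to the selected point. The sketch below is an illustration under stated assumptions, not the tool’s implementation: the houses, the feature names, the scaled L1 distance, and the per-feature `scales` are all hypothetical choices made for the example.

```python
def l1_distance(a, b, scales):
    """Sum of absolute feature differences, each divided by an assumed scale."""
    return sum(abs(a[name] - b[name]) / scales[name] for name in scales)

def nearest_counterfactual(selected, selected_label, dataset, labels, scales):
    """Return the closest data point whose predicted label differs."""
    best_point, best_distance = None, float("inf")
    for point, label in zip(dataset, labels):
        if label == selected_label:
            continue  # only points with the opposite prediction qualify
        distance = l1_distance(selected, point, scales)
        if distance < best_distance:
            best_point, best_distance = point, distance
    return best_point

def differing_features(a, b):
    """Names of the features whose values differ between two data points."""
    return sorted(name for name in a if a[name] != b[name])

# Illustrative data; labels are the model's predictions (1 = above $160,000).
houses = [
    {"sqft": 1500, "year_built": 1990, "fireplaces": 1},
    {"sqft": 1450, "year_built": 1990, "fireplaces": 0},
    {"sqft": 900, "year_built": 1950, "fireplaces": 0},
    {"sqft": 2400, "year_built": 2005, "fireplaces": 2},
]
labels = [1, 0, 0, 1]

# Assumed per-feature scales so that no single feature dominates the distance.
scales = {"sqft": 500.0, "year_built": 20.0, "fireplaces": 1.0}

selected = houses[0]
counterfactual = nearest_counterfactual(selected, labels[0], houses, labels, scales)
changed = differing_features(selected, counterfactual)
print(counterfactual)
print(changed)
```

The list in `changed` plays the role of the highlighted differences in the editor panel: these are the features worth inspecting when asking what might have flipped the prediction.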
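A histogram like this one boils down to grouping data points into bins by one feature and splitting each bin by the model’s prediction. A minimal sketch of that grouping follows; the square-footage values, the `"low"`/`"high"` prediction labels, and the bin width of 500 are made up for illustration.

```python
from collections import Counter

def binned_counts(values, predictions, bin_width):
    """Count data points per (bin start, predicted class) pair."""
    counts = Counter()
    for value, prediction in zip(values, predictions):
        bin_start = (value // bin_width) * bin_width  # left edge of the bin
        counts[(bin_start, prediction)] += 1
    return counts

# Illustrative square footage and model predictions for seven houses.
sqft = [850, 1200, 1350, 1600, 1750, 2100, 2400]
predictions = ["low", "low", "low", "high", "low", "high", "high"]

histogram = binned_counts(sqft, predictions, 500)
for (bin_start, prediction), count in sorted(histogram.items()):
    print(f"{bin_start}-{bin_start + 499} sq ft, predicted {prediction}: {count}")
```

Each `(bin, prediction)` count corresponds to one colored group of points within a histogram column.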