Scikit learn is one of the most widely used machine learning libraries in the machine learning community the reason behind that is the ease of code and availability of approximately all functionalities which a machine learning developer will need to build a machine learning model. In this article, we will learn how can we use sklearn to train an MLP model on the handwritten digits dataset. Some of the other benefits are:
- It provides classification, regression, and clustering algorithms such as the SVM algorithm, random forests, gradient boosting, and k-means.
- It is also designed to operate with Python’s scientific and numerical libraries NumPy and SciPy. Scikit-learn is a NumFOCUS project that has financial support.
Importing Libraries and Dataset
Let us begin by importing the model’s required libraries and loading the dataset digits.
Python3
from sklearn import datasets
digits = datasets.load_digits()
dir (digits)
|
Output:
['DESCR', 'data', 'feature_names', 'frame', 'images', 'target', 'target_names']
Function to print a set of image
- digits.image is a three-dimensional array. The first dimension indexes images, and we can see that there are 1797 in total.
- The following two dimensions relate to each image’s pixels’ x and y coordinates.
- Each image is 8×8 = 64 pixels in size. In other terms, this array may be represented in 3D as a stack of 8×8 pixel images.
Output:
[[ 0. 0. 5. 13. 9. 1. 0. 0.]
[ 0. 0. 13. 15. 10. 15. 5. 0.]
[ 0. 3. 15. 2. 0. 11. 8. 0.]
[ 0. 4. 12. 0. 0. 8. 8. 0.]
[ 0. 5. 8. 0. 0. 9. 8. 0.]
[ 0. 4. 11. 0. 1. 12. 7. 0.]
[ 0. 2. 14. 5. 10. 12. 0. 0.]
[ 0. 0. 6. 13. 10. 0. 0. 0.]]
The original digits had much higher resolution, and the resolution was reduced when preparing the dataset for scikit-learn to allow training a machine learning system to recognize these digits easier and faster. Because at such a low resolution, even a human would struggle to recognize some of the digits The low quality of the input photos will also limit our neural network in these settings. Is the neural network capable of doing at least as well as an individual? It would already be an accomplishment!
Python3
import matplotlib.pyplot as plt
def plot_multi(i):
nplots = 16
fig = plt.figure(figsize = ( 15 , 15 ))
for j in range (nplots):
plt.subplot( 4 , 4 , j + 1 )
plt.imshow(digits.images[i + j], cmap = 'binary' )
plt.title(digits.target[i + j])
plt.axis( 'off' )
plt.show()
plot_multi( 0 )
|
Output:
A batch of 16 Hand Written Digits
Training Neural network with the dataset
A neural network is a set of algorithms that attempts to recognize underlying relationships in a batch of data using a technique similar to how the human brain works. In this context, neural networks are systems of neurons that might be organic or artificial in nature.
- an input layer consisting of 64 nodes, one for each pixel in the input pictures They simply send their input value to the neurons of the following layer.
- This is a dense neural network, meaning each node in each layer is linked to all nodes in the preceding and following levels.
The input layer expects a one-dimensional array, whereas the image datasets are two-dimensional. As a result, flattening all images process takes place:
Python3
y = digits.target
x = digits.images.reshape(( len (digits.images), - 1 ))
x.shape
|
Output:
(1797, 64)
The 8×8 images’ two dimensions have been merged into a single dimension by composing the rows of 8 pixels one after the other. The first image, which we discussed before, is now represented as a 1-D array having 8×8 = 64 slots.
Output:
array([ 0., 0., 5., 13., 9., 1., 0., 0., 0., 0., 13., 15., 10.,
15., 5., 0., 0., 3., 15., 2., 0., 11., 8., 0., 0., 4.,
12., 0., 0., 8., 8., 0., 0., 5., 8., 0., 0., 9., 8.,
0., 0., 4., 11., 0., 1., 12., 7., 0., 0., 2., 14., 5.,
10., 12., 0., 0., 0., 0., 6., 13., 10., 0., 0., 0.])
Data split for training and testing.
When machine learning algorithms are used to make predictions based on data that was not used to train the model, the train-test split process is used to measure their performance.
It is a quick and simple technique that allows you to compare the performance of algorithms for machine learning for your predictive modeling challenge.
Python3
x_train = x[: 1000 ]
y_train = y[: 1000 ]
x_test = x[ 1000 :]
y_test = y[ 1000 :]
|
Usage of Multi-Layer Perceptron classifier
MLP stands for multi-layer perceptron. It consists of densely connected layers that translate any input dimension to the required dimension. A multi-layer perception is a neural network with multiple layers. To build a neural network, we connect neurons so that their outputs become the inputs of other neurons.
Python3
from sklearn.neural_network import MLPClassifier
mlp = MLPClassifier(hidden_layer_sizes = ( 15 ,),
activation = 'logistic' ,
alpha = 1e - 4 , solver = 'sgd' ,
tol = 1e - 4 , random_state = 1 ,
learning_rate_init = . 1 ,
verbose = True )
|
Now is the time to train our MLP model on the training data.
Python3
mlp.fit(x_train, y_train)
|
Output:
Iteration 185, loss = 0.01147629
Iteration 186, loss = 0.01142365
Iteration 187, loss = 0.01136608
Iteration 188, loss = 0.01128053
Iteration 189, loss = 0.01128869
Training loss did not improve more than tol=0.000100 for 10 consecutive epochs.
Stopping.
MLPClassifier(activation='logistic', hidden_layer_sizes=(15,),
learning_rate_init=0.1, random_state=1, solver='sgd',
verbose=True)
Above shown is the loss for the last five epochs of the MLPClassifier and its respective configuration.
Python3
fig, axes = plt.subplots( 1 , 1 )
axes.plot(mlp.loss_curve_, 'o-' )
axes.set_xlabel( "number of iteration" )
axes.set_ylabel( "loss" )
plt.show()
|
Output:
Model Evaluation
Now let’s check the performance of the model using the recognition dataset or it just has memorized it. We will do this by using the leftover testing data so, that we can check whether the model has learned the actual pattern in the digit
Python3
predictions = mlp.predict(x_test)
predictions[: 50 ]
|
Output:
array([1, 4, 0, 5, 3, 6, 9, 6, 1, 7, 5, 4, 4, 7, 2, 8, 2, 2, 5, 7, 9, 5,
4, 4, 9, 0, 8, 9, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8, 3, 0, 1, 2, 3, 4,
5, 6, 7, 8, 5, 0])
But the true labels or we can say that the ground truth labels were as shown below.
Output:
array([1, 4, 0, 5, 3, 6, 9, 6, 1, 7, 5, 4, 4, 7, 2, 8, 2, 2, 5, 7, 9, 5,
4, 4, 9, 0, 8, 9, 8, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4,
5, 6, 7, 8, 9, 0])
So, by using the predicted labels and the ground truth labels we can find the accuracy of our model.
Python3
from sklearn.metrics import accuracy_score
accuracy_score(y_test, predictions)
|
Output:
0.9146800501882058
Like Article
Suggest improvement
Share your thoughts in the comments
Please Login to comment...