
Multi-Layer Perceptron: A Supervised Neural Network Model Using Sklearn

An artificial neural network (ANN), often called a neural network or simply a neural net, is a machine learning model loosely inspired by the structure and operation of the human brain. It is the central building block of deep learning, a branch of machine learning. A neural network is built from interconnected nodes, also called artificial neurons or units, arranged in layers: an input layer, one or more hidden layers, and an output layer. Each neuron computes a weighted sum of its inputs, applies an activation function to that sum, and passes the result on as its output. The architecture of the network, including the number of layers and the number of neurons in each layer, varies with the task at hand. Because of this versatility, neural networks are used for many machine learning tasks, including classification, regression, image recognition, and natural language processing.
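The computation performed by one neuron can be sketched in a few lines of NumPy. This is only an illustration: the input values, weights, and bias below are hypothetical, and the sigmoid is just one possible activation function.

```python
import numpy as np

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = np.dot(weights, inputs) + bias   # weighted sum
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid activation squashes z into (0, 1)

# Hypothetical values chosen only for illustration
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.2])
b = 0.1

out = neuron_output(x, w, b)
print(out)
```

A layer is simply many such neurons sharing the same inputs, each with its own weights and bias.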

A neural network is trained by adjusting the weights of its connections so as to reduce the discrepancy between predicted and actual outputs. Optimization techniques such as gradient descent are used to do this. Neural networks have demonstrated a strong ability to solve complicated problems; deep neural networks in particular have driven significant advances in fields like computer vision, speech recognition, and autonomous driving. They play a key role in modern AI and machine learning because they can automatically learn and extract features from data.
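To make the idea of gradient descent concrete, here is a minimal sketch that fits a single linear neuron to toy data by repeatedly stepping against the gradient of the mean squared error. The data, learning rate, and iteration count are assumptions picked for illustration; real networks apply the same update to many weights at once.

```python
import numpy as np

# Toy data generated from y = 2x + 1 (hypothetical example)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0          # initial weight and bias
learning_rate = 0.05

for _ in range(2000):
    y_hat = w * x + b                    # forward pass: current predictions
    error = y_hat - y                    # discrepancy between predicted and actual
    grad_w = 2.0 * np.mean(error * x)    # d(MSE)/dw
    grad_b = 2.0 * np.mean(error)        # d(MSE)/db
    w -= learning_rate * grad_w          # step against the gradient
    b -= learning_rate * grad_b

print(round(float(w), 3), round(float(b), 3))
```

After enough steps the learned weight and bias approach the values used to generate the data.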



Supervised Neural Network models

A supervised neural network model is a type of machine learning model used for tasks where you have labelled data, meaning you know both the input and the corresponding correct output. In this model, you feed input data into layers of interconnected artificial neurons, which process the information and produce an output. During training, the model adjusts its internal parameters (weights and biases) to minimize the difference between its predictions and the actual labels in the training data. Training continues until the error stops improving or an iteration limit is reached; the goal is a model that also makes accurate predictions on new, unseen data. Supervised neural networks are commonly used for tasks like image classification, speech recognition, and natural language processing, where the goal is to map inputs to specific categories or values.

Multi-Layer Perceptron Architecture

An MLP (Multi-Layer Perceptron) is a type of neural network whose architecture consists of an input layer, one or more hidden layers, and an output layer of interconnected neurons. It can learn complex patterns and perform tasks such as classification and regression by adjusting its parameters through training. Let’s explore the architecture of an MLP in detail.



Each neuron in the hidden and output layers applies an activation function to the weighted sum of its inputs; the input layer simply passes the feature values forward. Commonly used activation functions are the sigmoid, the hyperbolic tangent (tanh), and the rectified linear unit (ReLU). During training, the MLP updates its connection (synapse) weights using backpropagation, which computes the gradient of the error with respect to each weight, together with an optimization method such as gradient descent. This reduces the discrepancy between predicted and actual outputs. Because the number of hidden layers, the number of neurons per layer, and the choice of activation function are all configurable, MLPs suit a wide range of machine learning and deep learning problems, from simple to highly complex.
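The forward pass described above can be sketched in NumPy as follows. The layer sizes (4 inputs, 8 hidden neurons, 3 outputs), the random weights, and the helper name `forward` are all hypothetical; the sketch only shows how data flows through the layers, not how the weights are learned.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def forward(x, layers, hidden_activation=relu):
    """Propagate a batch x through a list of (weights, bias) pairs."""
    a = x
    for W, b in layers[:-1]:
        a = hidden_activation(a @ W + b)   # hidden layer: weighted sum, then activation
    W_out, b_out = layers[-1]
    return a @ W_out + b_out               # raw output scores, no activation applied here

rng = np.random.default_rng(0)
# Hypothetical sizes: 4 inputs -> 8 hidden neurons -> 3 outputs
layers = [
    (rng.normal(size=(4, 8)), np.zeros(8)),
    (rng.normal(size=(8, 3)), np.zeros(3)),
]
x = rng.normal(size=(5, 4))                # batch of 5 samples
scores = forward(x, layers)
print(scores.shape)
```

Passing `hidden_activation=np.tanh` or `hidden_activation=sigmoid` swaps the activation without changing anything else, which is the flexibility the paragraph refers to.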

MLP Classifier with its Parameters

The MLP Classifier, short for Multi-Layer Perceptron Classifier, is a neural network-based classification algorithm provided by the Scikit-Learn library as the MLPClassifier class. It is a feedforward neural network, in which information moves in only one direction: forward through the layers. Its parameters, such as hidden_layer_sizes, activation, solver, learning_rate, and max_iter, collectively define the architecture and training behavior of the classifier.
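As a quick illustration, the snippet below constructs an MLPClassifier with several of its commonly tuned parameters written out explicitly. The specific values are assumptions chosen for the example, not recommendations, and the variable name `clf_example` is hypothetical.

```python
from sklearn.neural_network import MLPClassifier

clf_example = MLPClassifier(
    hidden_layer_sizes=(64, 32),   # two hidden layers with 64 and 32 neurons
    activation="relu",             # 'identity', 'logistic', 'tanh' or 'relu'
    solver="adam",                 # 'lbfgs', 'sgd' or 'adam'
    alpha=1e-4,                    # L2 regularization strength
    learning_rate="constant",      # learning-rate schedule; only used when solver='sgd'
    learning_rate_init=0.001,      # initial step size for 'sgd' and 'adam'
    max_iter=1000,                 # upper bound on training iterations
    random_state=42,               # makes weight initialization reproducible
)

params = clf_example.get_params()
print(params["activation"], params["solver"])
```

Calling `get_params()` is a convenient way to see every setting, including the defaults that were not passed explicitly.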

Implementation using Iris Dataset

Let’s apply the steps explained above to the famous Iris dataset. Below is an example of building and training a neural network to classify iris flowers; the same workflow applies to a custom dataset.

Importing Libraries




# Importing required libraries
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

This code imports the libraries needed for a neural network-based classifier: NumPy for numerical operations, train_test_split for dividing the data into training and test sets, StandardScaler for scaling features, MLPClassifier for building the MLP (Multi-Layer Perceptron) classifier, load_iris for loading the Iris dataset, and accuracy_score for assessing the model’s accuracy.

Loading Dataset




# Loading dataset
iris = load_iris() 
X, y = iris.data, iris.target

This code loads the Iris dataset with scikit-learn’s load_iris() function, assigning the feature data to X and the target labels to y. The Iris dataset is a popular benchmark for classification problems in machine learning: it contains 150 samples, each with four features, divided evenly among three species.
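Before modelling, it can help to look at what was loaded. The short sketch below prints the array shapes and the feature and class names that load_iris() returns alongside the data.

```python
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

print(X.shape, y.shape)            # number of samples and features
print(iris.feature_names)          # names of the four measurements
print(list(iris.target_names))     # names of the three species
```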

Splitting Data into Train and Test Sets




# Splitting data set into train & test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

Using train_test_split() from scikit-learn, this code divides the loaded dataset (X and y) into training and testing sets. The test_size parameter sets the fraction of the data allotted to the test set (20% in this case), and random_state fixes the random seed so that the split is reproducible.
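As an optional variant, not used in the rest of this article, train_test_split() also accepts a `stratify` argument that keeps the class proportions the same in both sets. The sketch below uses separate variable names (with an `_s` suffix, chosen here for illustration) so that it does not overwrite the split made above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Stratified variant of the split: each class keeps its share in both sets
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

print(X_train_s.shape, X_test_s.shape)
print(np.bincount(y_test_s))       # samples per class in the test set
```

Stratification matters most for small or imbalanced datasets, where a purely random split can leave a class under-represented in the test set.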

Feature Scaling




# Creating Object
scaler = StandardScaler() 
# Standardizing the features
X_train = scaler.fit_transform(X_train) 
X_test = scaler.transform(X_test)

To standardize the feature data, this code creates a StandardScaler object named scaler. The fit_transform() method computes the mean and standard deviation of each feature on the training set (X_train) and uses them to standardize it. The transform() method then applies the identical transformation to the test set (X_test), so both sets are scaled with statistics taken from the training set only and no information from the test set leaks into training. Many machine learning algorithms, neural networks included, work best when features are on similar scales, which is why this standardization step is important.
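A small check, sketched below, confirms what the scaler does: after fit_transform() each training feature has mean close to 0 and standard deviation close to 1, and the test set is transformed with the training statistics stored in `scaler.mean_` and `scaler.scale_`. The names `X_train_std` and `X_test_std` are introduced here only for the illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, _, _ = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)   # statistics learned from training data
X_test_std = scaler.transform(X_test)         # same statistics reused on test data

print(X_train_std.mean(axis=0).round(3))      # approximately 0 for every feature
print(X_train_std.std(axis=0).round(3))       # approximately 1 for every feature

# The test set is scaled by hand with the training mean and scale for comparison
manual = (X_test - scaler.mean_) / scaler.scale_
print(np.allclose(manual, X_test_std))
```

The test-set means will generally not be exactly 0, which is expected: the test data is deliberately scaled with the training statistics rather than its own.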

Model Development




# Creating (MLP) classifier
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000,
                    random_state=42)

This code uses the MLPClassifier class from scikit-learn to create an instance of the Multi-Layer Perceptron (MLP) classifier. The hidden_layer_sizes argument specifies the network’s architecture; the tuple (64, 32) defines two hidden layers with 64 and 32 neurons, respectively. The max_iter parameter, set to 1000, caps the number of training iterations; with the default adam solver, each iteration is one epoch, that is, one pass over the training data. For reproducibility, random_state is set to 42.
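To get a feel for the size of this architecture, the sketch below counts the trainable parameters of a fully connected network with 4 inputs (the Iris features), hidden layers of 64 and 32 neurons, and 3 outputs (the Iris classes). The helper `count_parameters` is hypothetical and written only for this illustration.

```python
def count_parameters(n_inputs, hidden_layer_sizes, n_outputs):
    """Count weights and biases of a fully connected feedforward network."""
    sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out + fan_out   # weight matrix plus one bias per neuron
    return total

# 4 features -> 64 -> 32 -> 3 classes
n_params = count_parameters(4, (64, 32), 3)
print(n_params)   # (4*64 + 64) + (64*32 + 32) + (32*3 + 3) = 2499
```

Most of the parameters sit between the two hidden layers, so widening those layers grows the model much faster than adding input features would.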

Training the model and Prediction




# Training the model
clf.fit(X_train, y_train)
# Making prediction
y_pred = clf.predict(X_test) 

This code calls the fit method to train the MLP classifier (clf) on the standardized training data (X_train) and labels (y_train). The trained model then predicts labels for the test data (X_test), and the predictions are stored in the variable y_pred.
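Once fitted, the classifier exposes what it has learned. The self-contained sketch below repeats the steps so far and then inspects the weight-matrix shapes (`coefs_`), the number of iterations actually run (`n_iter_`), and the class probabilities from predict_proba() for the first three test samples. The variable names `weight_shapes` and `proba` are introduced here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000,
                    random_state=42)
clf.fit(X_train, y_train)

# One weight matrix per pair of adjacent layers
weight_shapes = [W.shape for W in clf.coefs_]
print(weight_shapes)
print(clf.n_iter_)                 # iterations run before training stopped

# Predicted probability of each class for the first three test samples
proba = clf.predict_proba(X_test[:3])
print(proba.round(3))
```

Each row of `proba` sums to 1, and predict() returns the class with the highest probability in that row.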

Evaluation of the model




# Determining Accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

Accuracy: 0.97

This code uses scikit-learn’s accuracy_score function to compute the accuracy of the MLP classifier, that is, the fraction of predicted labels (y_pred) that match the true test labels (y_test). The result is printed to the console, formatted to two decimal places for readability.
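Accuracy is a single number, so it can hide which classes are being confused. As a sketch of a fuller evaluation, the self-contained snippet below repeats the pipeline and adds a confusion matrix, a macro-averaged F1-score, and a per-class report from sklearn.metrics. Exact values may vary with the library version, so no expected output is shown.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, f1_score, classification_report

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000,
                    random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Unweighted mean of the per-class F1-scores
macro_f1 = f1_score(y_test, y_pred, average="macro")
print(f"Macro F1: {macro_f1:.2f}")

# Precision, recall and F1 for each species
print(classification_report(y_test, y_pred, target_names=iris.target_names))
```

Off-diagonal entries in the confusion matrix show exactly which species were mistaken for which.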

Conclusion

In conclusion, this article used Scikit-Learn’s MLPClassifier to build a supervised neural network model, a powerful tool for a variety of machine learning applications. The model is flexible in both network architecture and hyperparameter tuning, which lets it handle varied kinds of datasets and difficult problems. The procedure starts with loading and preprocessing the data, which includes dividing the dataset into training and testing sets and standardizing the features so that they share a uniform scale. The MLPClassifier can be customised through parameters such as hidden_layer_sizes, activation, solver, learning_rate, and max_iter, which affect the network’s capacity, training speed, and convergence behavior. Once the model has been fitted to the training data, it can make predictions on new, unseen data, and its performance is evaluated with metrics such as accuracy and F1-score. On the Iris test set the classifier reached an accuracy of 0.97, and with careful parameter tuning the same approach can perform well on more challenging, real-world classification tasks.

