Classification Using Sklearn Multi-layer Perceptron

Last Updated : 11 Oct, 2023

Multi-Layer Perceptrons (MLPs) are a class of artificial neural networks that is widely used for classification. They are flexible and effective across many classification problems, including text classification and image recognition. Where traditional linear classifiers fall short, MLPs can model complicated, non-linear relationships in data. In this article, we’ll look at how to use the popular Python machine learning library scikit-learn to implement classification using MLPs.

Architecture and Working of Multi-Layer Perceptron

A Multi-Layer Perceptron (MLP) is a type of artificial neural network made up of multiple layers of connected nodes (also known as neurons). It is frequently used for machine-learning tasks such as classification and regression. An overview of an MLP’s structure and operation is provided below:


  • Input Layer: The input layer is made up of neurons that directly receive the dataset’s features. Each neuron represents one feature, so the number of input neurons equals the number of features in the dataset.
  • Hidden Layers: One or more hidden layers sit between the input and output layers. The number of hidden layers and the number of neurons in each are hyperparameters that you choose. These layers are what allow the network to recognize intricate patterns in the data.
  • Output Layer: The output layer generates the final predictions from the data processed in the hidden layers. The task determines how many neurons it contains:
    • For binary classification, there is often a single neuron that produces a probability score.
    • For multi-class classification, there are as many neurons as classes, and each neuron produces a probability score for one class.
    • For regression, a single neuron produces the continuous predicted value.
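The layer structure described above can be inspected on a fitted scikit-learn model. The following is a minimal sketch, assuming the Breast Cancer dataset (30 features, two classes) and an arbitrary pair of small hidden layers (8 and 4 neurons) chosen only for illustration. It prints the shape of each weight matrix, which should mirror the input, hidden, and output layer sizes. Scaling is applied to the whole dataset here purely to help the toy model fit; it is not an evaluation setup.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Load a binary classification dataset with 30 features
X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

# Hidden layer sizes (8, 4) are an illustrative assumption
mlp = MLPClassifier(hidden_layer_sizes=(8, 4), max_iter=500, random_state=0)
mlp.fit(X, y)

# One weight matrix per pair of consecutive layers: (inputs, outputs)
layer_shapes = [w.shape for w in mlp.coefs_]
print("Weight matrix shapes:", layer_shapes)

# For a binary problem, scikit-learn uses a single output neuron
print("Output neurons:", mlp.n_outputs_)
print("Output activation:", mlp.out_activation_)
```

The first matrix has one row per input feature, and the last has a single column because binary classification needs only one output neuron.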


Training and prediction then proceed through the following steps:

  • Initialization: Set the weights (W) and biases (B) of all neurons in the network to their initial values. Small random numbers are usually used for this.
  • Forward Propagation: Input data is passed through the network repeatedly during training. Each neuron computes the weighted sum of the outputs of the previous layer plus its bias, applies an activation function, and passes the result to the next layer. The activation functions introduce non-linearity, which is what enables the model to learn intricate relationships.
  • Loss Calculation: A loss (error) is computed by comparing the network’s output to the true target values. Common loss functions include Mean Squared Error (MSE) for regression and Cross-Entropy for classification.
  • Backpropagation: To reduce the loss, the network adjusts its weights and biases. The backpropagation algorithm calculates the gradient of the loss with respect to each network parameter, and an optimization technique such as Gradient Descent uses these gradients to update the weights and biases.
  • Training: Forward propagation, loss calculation, and backpropagation are repeated over a number of iterations (epochs) until the model converges to a solution. The learning rate and the number of iterations are hyperparameters that can be tuned.
  • Prediction: Once trained, the MLP makes predictions on new, unseen data by running forward propagation with the learned weights and biases.
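The steps above can be sketched from scratch with NumPy. This is an illustrative toy, not how scikit-learn implements training: the XOR dataset, the single hidden layer of 8 tanh neurons, the learning rate of 0.5, and the 2000 epochs are all assumptions chosen to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): the XOR problem, which is not linearly separable
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# Initialization: small random weights, zero biases
W1 = rng.normal(scale=0.5, size=(2, 8))
b1 = np.zeros((1, 8))
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros((1, 1))


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


learning_rate = 0.5
losses = []

for epoch in range(2000):
    # Forward propagation: weighted sum plus bias, then activation
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Loss calculation: binary cross-entropy (epsilon avoids log(0))
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    losses.append(loss)

    # Backpropagation: gradients of the loss w.r.t. each parameter
    dz2 = (p - y) / len(X)           # sigmoid + cross-entropy gradient
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dz1 = (dz2 @ W2.T) * (1 - h**2)  # tanh derivative is 1 - tanh^2
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)

    # Gradient descent update
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

# Prediction: forward propagation with the learned parameters
probabilities = sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2)
print(f"Initial loss: {losses[0]:.4f}, final loss: {losses[-1]:.4f}")
print("Predicted probabilities:", probabilities.ravel().round(3))
```

Each pass through the loop performs one forward propagation, one loss calculation, and one backpropagation step, so the recorded loss should fall as training proceeds.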

Although MLPs are well known for their capacity to represent complicated relationships in data, they can be sensitive to certain hyperparameters, including the number of hidden layers and neurons, the choice of activation functions, and the regularization strategy. For MLPs to perform well, careful hyperparameter tuning is crucial.
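One common way to tune these hyperparameters in scikit-learn is a cross-validated grid search. The sketch below is only an illustration: the candidate layer sizes, the values of the L2 regularization strength alpha, and the 3-fold cross-validation are assumptions, and a real search would usually cover a wider grid. Scaling is placed inside a pipeline so that each fold is standardized using its own training portion.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Pipeline: standardize features, then fit the MLP
pipeline = make_pipeline(
    StandardScaler(),
    MLPClassifier(max_iter=1000, random_state=42),
)

# Candidate values are illustrative assumptions, not recommendations
param_grid = {
    "mlpclassifier__hidden_layer_sizes": [(32,), (64, 32)],
    "mlpclassifier__alpha": [1e-4, 1e-2],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.3f}")
print(f"Test accuracy: {search.score(X_test, y_test):.3f}")
```

The parameter names are prefixed with mlpclassifier__ because make_pipeline names each step after its lowercased class name.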


To perform classification using a Multi-Layer Perceptron in scikit-learn, we need to follow specific steps. Here’s an overview:

Importing Libraries


# Import necessary libraries
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

To create an MLP (Multi-Layer Perceptron) classifier using Scikit-Learn, first import the necessary libraries with the code snippet above. The imports cover the MLPClassifier model, the loader for the Breast Cancer dataset, the utility for splitting the data, the StandardScaler for standardizing features, and the metrics for model evaluation: accuracy, classification report, and confusion matrix.

Loading Dataset


# Load the Breast Cancer dataset
cancer_data = load_breast_cancer()
X, y = cancer_data.data, cancer_data.target

This code uses the load_breast_cancer() function to load the Breast Cancer dataset from Scikit-Learn. The feature data is assigned to variable X and the corresponding target labels to variable y. The dataset is frequently used for binary classification, where the goal is to categorize breast tumors as either malignant or benign. The feature data (X) describes different properties of the tumors, while the target labels (y) give each tumor’s class (in this dataset, 0 is malignant and 1 is benign).

Splitting dataset into train and test sets


# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

Using the train_test_split function from scikit-learn, this snippet divides the Breast Cancer dataset into training and testing sets. The X and y variables, which hold the feature data and target labels, are split into four subsets: X_train (training features), X_test (testing features), y_train (training labels), and y_test (testing labels). The test_size parameter is set to 0.2, so 20% of the data is held out for testing and the remaining 80% is used to train the model. The random_state parameter fixes the random seed for the split, which ensures reproducibility.
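As a quick check of the split described above, the sketch below prints the resulting shapes. It also shows the optional stratify argument, which is not used in this article's code but is a common addition (an assumption here) when you want both subsets to keep the same class proportions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Same split as in the article: 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)  # (455, 30) (114, 30)

# Optional variant (assumption): preserve class proportions in both subsets
_, _, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
train_benign_share = np.mean(y_train_s == 1)
test_benign_share = np.mean(y_test_s == 1)
print(f"Benign share, train: {train_benign_share:.3f}")
print(f"Benign share, test:  {test_benign_share:.3f}")
```

The dataset has 569 samples, so a 20% test split yields 114 test samples, which matches the support total in the classification report later in the article.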

Feature Scaling


# Standardize features by removing the mean and scaling to unit variance
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

This code uses the Scikit-Learn StandardScaler to standardize the features. First, the scaler is fitted on the training data (X_train), which is transformed by removing the mean and scaling to unit variance. The testing data (X_test) is then transformed using the mean and variance learned from the training data, so no information from the test set leaks into preprocessing. Standardization ensures that every feature is on a comparable scale, which improves the performance of models that are sensitive to feature magnitudes, such as MLPs.
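The effect of standardization can be verified directly. The following sketch, which uses separate variable names (X_train_scaled, X_test_scaled) as an illustrative choice, checks that every training feature ends up with a mean of approximately 0 and a standard deviation of approximately 1, while the test features are only close to those values because they reuse the training statistics.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std, then scale
X_test_scaled = scaler.transform(X_test)        # reuse training mean/std

# Per-feature statistics after scaling
train_means = X_train_scaled.mean(axis=0)
train_stds = X_train_scaled.std(axis=0)
test_means = X_test_scaled.mean(axis=0)

print("Largest |mean| on training features:", np.abs(train_means).max())
print("Training std range:", train_stds.min(), train_stds.max())
print("Largest |mean| on test features:", np.abs(test_means).max())
```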

Model Development


# Create an MLPClassifier model
mlp = MLPClassifier(hidden_layer_sizes=(64, 32),
                    max_iter=1000, random_state=42)

This code creates the Multi-Layer Perceptron classifier, MLPClassifier. The hidden_layer_sizes parameter defines the network’s architecture: two hidden layers with 64 and 32 neurons, respectively. The max_iter parameter specifies the maximum number of iterations the solver may run during training, while the random_state option seeds random number generation so that results are reproducible.
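Parameters that are not set explicitly fall back to scikit-learn's defaults. As a small sketch, get_params() lists the full configuration. In recent scikit-learn releases the default activation is relu and the default solver is adam, but it is worth confirming this against the version you have installed.

```python
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(hidden_layer_sizes=(64, 32),
                    max_iter=1000, random_state=42)

# Inspect the explicit and default settings of the model
params = mlp.get_params()
for name in ("hidden_layer_sizes", "activation", "solver",
             "alpha", "learning_rate_init", "max_iter"):
    print(f"{name}: {params[name]}")
```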

Training and Prediction


# Train the model on the training data
mlp.fit(X_train, y_train)
# Make predictions on the test data
y_pred = mlp.predict(X_test)
# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")


Output:

Accuracy: 0.97

This code uses the fit method to train the MLPClassifier model on the training data, during which the model adjusts its weights and biases to recognize patterns in the data. Next, the trained model makes predictions on the test data, and these are compared with the actual labels. The accuracy_score function computes the fraction of predictions that match the actual labels, and the result is printed rounded to two decimal places (0.97 corresponds to 97%).
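To make the accuracy calculation concrete, the sketch below compares a manual computation with accuracy_score on a tiny set of hand-written labels. The arrays y_true and y_hat are made-up values for illustration and are unrelated to the Breast Cancer results.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical labels: 4 of the 5 predictions match
y_true = np.array([0, 1, 1, 0, 1])
y_hat = np.array([0, 1, 0, 0, 1])

# Accuracy is the fraction of predictions equal to the true labels
manual_accuracy = np.mean(y_true == y_hat)
library_accuracy = accuracy_score(y_true, y_hat)

print(f"Manual:  {manual_accuracy:.2f}")   # Manual:  0.80
print(f"Sklearn: {library_accuracy:.2f}")  # Sklearn: 0.80
```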



# Generate a classification report
class_report = classification_report(y_test, y_pred)
print("Classification Report:\n", class_report)

Output: Classification Report

Classification Report:
               precision    recall  f1-score   support

           0       0.98      0.95      0.96        43
           1       0.97      0.99      0.98        71

    accuracy                           0.97       114
   macro avg       0.97      0.97      0.97       114
weighted avg       0.97      0.97      0.97       114

The classification report provides performance metrics for each class (precision, recall, F1-score, and support), along with the overall accuracy and the macro and weighted averages.
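The confusion_matrix function is imported at the start but not used in the steps above. As a sketch of how it relates to the report, the example below rebuilds the same pipeline, prints the confusion matrix, and derives precision and recall for class 1 from its entries. The exact counts depend on your scikit-learn version, so no expected output is shown.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

mlp = MLPClassifier(hidden_layer_sizes=(64, 32),
                    max_iter=1000, random_state=42)
mlp.fit(X_train, y_train)
y_pred = mlp.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", cm)

# Precision: correct class-1 predictions / all class-1 predictions
precision_1 = cm[1, 1] / cm[:, 1].sum()
# Recall: correct class-1 predictions / all true class-1 samples
recall_1 = cm[1, 1] / cm[1, :].sum()
print(f"Class 1 precision: {precision_1:.2f}, recall: {recall_1:.2f}")
```

The values computed by hand should agree with precision_score and recall_score, which use class 1 as the positive label by default.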

Advantages of Classification using Multi-Layer Perceptron

  • Non-Linearity Handling: MLPs suit a wide variety of classification problems because they can model complicated, non-linear relationships between features and target classes.
  • Scalability: Thanks to improvements in hardware and libraries (such as TensorFlow and PyTorch), it is now possible to train very large MLPs on enormous datasets, which makes them useful in many fields.
  • Feature Learning: MLPs automatically learn useful features from the data, which reduces the need for extensive manual feature engineering.
  • Parallel Processing: On modern hardware, training and inference in MLPs can be parallelized, which speeds up execution.
