
Restricted Boltzmann Machine (RBM) with Practical Implementation

Last Updated : 26 Dec, 2023

The Restricted Boltzmann Machine (RBM) is a generative model that has been widely used for applications such as dimensionality reduction, feature learning, and collaborative filtering. In this article, we will explore the concepts and steps involved in training and using RBMs, along with a practical example built with scikit-learn.

What is a Restricted Boltzmann Machine?

A Restricted Boltzmann Machine is a type of artificial neural network that falls under the category of generative models. It is a restricted variant of the Boltzmann machine, which Geoffrey Hinton and Terry Sejnowski introduced in the 1980s; the RBM itself was proposed by Paul Smolensky in 1986 under the name Harmonium and was later popularized by Hinton. RBMs consist of two layers: visible units and hidden units. Every visible unit is connected to every hidden unit, but there are no connections between units within the same layer.

RBMs are called “restricted” because of the restrictions imposed on the connections between units. This restriction ensures that the visible units are only connected to the hidden units and vice versa, making RBMs a bipartite graph. This architectural constraint simplifies the training and inference procedures of RBMs.
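To make the bipartite structure concrete, here is a minimal NumPy sketch. It assumes the standard binary RBM formulation, which this article does not spell out: a single weight matrix W between the layers, visible biases b, hidden biases c, and the energy E(v, h) = -b·v - c·h - v·W·h. The names and toy values are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_energy(v, h, W, b, c):
    """Energy of a joint configuration (v, h) of a binary RBM."""
    return -(b @ v) - (c @ h) - (v @ W @ h)

def hidden_probabilities(v, W, c):
    """p(h_j = 1 | v) for every hidden unit at once.

    Because hidden units are not connected to each other, each one
    depends only on v, so a single matrix product is enough.
    """
    return sigmoid(c + v @ W)

# Toy RBM with 3 visible and 2 hidden units (values chosen for illustration).
W = np.array([[1.0, -1.0],
              [0.5,  0.0],
              [0.0,  2.0]])   # W[i, j] links visible unit i to hidden unit j
b = np.zeros(3)               # visible biases
c = np.zeros(2)               # hidden biases

v = np.array([1.0, 0.0, 1.0])
h = np.array([0.0, 1.0])

energy = rbm_energy(v, h, W, b, c)
p_h = hidden_probabilities(v, W, c)
print(energy)  # -1.0
print(p_h)     # about 0.731 for both hidden units
```

Note that there is no visible-to-visible or hidden-to-hidden weight matrix anywhere in the sketch; that absence is the restriction.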

How does an RBM work?

The working of an RBM can be divided into two main steps: training and inference.

Training an RBM

The training of an RBM involves adjusting the weights and biases to maximize the likelihood of the training data. Because the exact likelihood gradient is intractable, training typically relies on an approximation called Contrastive Divergence (CD). CD compares the unit activations driven by the training data with those obtained after a few steps of Gibbs sampling, and uses the difference to update the weights and biases iteratively.

The training process starts by initializing the weights and biases randomly. Then, a training sample is presented to the RBM, and the activations of the hidden units are computed using the current weights and biases. Next, the activations of the hidden units are used to reconstruct the visible units, and the process is repeated for a few steps to obtain the reconstructed visible units. Finally, the updates to the weights and biases are computed from the difference between the visible-hidden activation statistics of the original data and those of the reconstruction.

Inference with an RBM

Once the RBM is trained, it can be used for inference tasks such as generating new samples or performing classification. For generating new samples, the RBM starts with a random configuration of the visible units and then iteratively updates the hidden units and reconstructed visible units. This process allows the RBM to generate new samples that are similar to the training data.
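The sampling loop described above can be sketched in NumPy as follows. This is a minimal illustration, not a full implementation: the function name `gibbs_sample` is hypothetical, and the weights are random placeholders standing in for the parameters of a trained RBM.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sample(W, b, c, n_steps, rng):
    """Run a Gibbs chain from a random visible state and return the final visible sample."""
    n_visible, n_hidden = W.shape
    # Start from a random binary configuration of the visible units
    v = rng.integers(0, 2, size=n_visible).astype(float)
    for _ in range(n_steps):
        # Sample the hidden units given the current visible units
        h = (rng.random(n_hidden) < sigmoid(c + v @ W)).astype(float)
        # Sample the reconstructed visible units given the hidden units
        v = (rng.random(n_visible) < sigmoid(b + W @ h)).astype(float)
    return v

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.1, size=(6, 3))  # placeholder weights; a trained RBM would supply these
b = np.zeros(6)                        # visible biases
c = np.zeros(3)                        # hidden biases

sample = gibbs_sample(W, b, c, n_steps=100, rng=rng)
print(sample)
```

With trained weights, running the chain for more steps generally gives samples that depend less on the random starting point.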

For classification tasks, RBMs can be used as feature extractors. The hidden units can be viewed as a compressed representation of the input data, capturing the most relevant features. These features can then be fed into another classifier, such as a logistic regression model, to perform the actual classification task.

Steps to Train a Restricted Boltzmann Machine

Training a Restricted Boltzmann Machine involves several steps. Let’s walk through each of these steps in detail:

  1. Data Preprocessing: Before training an RBM, it is essential to preprocess the data. This may include normalization, scaling, or any other preprocessing techniques specific to the dataset.
  2. Initializing the RBM: The RBM is initialized by randomly assigning weights and biases to its connections. The weights and biases can be sampled from a Gaussian distribution or any other suitable distribution.
  3. Computing Hidden Unit Activations: Given a training sample, the activations of the hidden units are computed using the current weights and biases. This is done by applying the sigmoid activation function to the weighted sum of the visible units connected to each hidden unit.
  4. Sampling Hidden Units: Once the hidden unit activations are computed, the hidden units are sampled based on their activations. This is done by comparing the activations to random numbers drawn from a uniform distribution.
  5. Computing Reconstructed Visible Units: The reconstructed visible units are computed using the activations of the hidden units and the corresponding weights and biases. This is similar to the computation of the hidden unit activations, but in the opposite direction.
  6. Updating the Weights and Biases: The weights and biases of the RBM are updated based on the difference between the original visible units and the reconstructed visible units. This update is usually done using a learning rate that controls the magnitude of the weight and bias updates.
  7. Repeating Steps 3-6: Steps 3-6 are repeated for multiple iterations or until a convergence criterion is met. This allows the RBM to learn the underlying patterns in the training data and adjust its weights and biases accordingly.
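Steps 2 to 7 can be sketched in NumPy as below. This is a hedged illustration of one common variant, CD-1 with a single reconstruction step, where the update uses the difference between data-driven and reconstruction-driven statistics. The function name `cd1_update`, the toy data, and the hyperparameter values are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(V, W, b, c, lr, rng):
    """Apply one CD-1 update in place on a mini-batch V of shape (batch, n_visible)."""
    # Step 3: hidden activation probabilities given the data
    ph_data = sigmoid(V @ W + c)
    # Step 4: sample binary hidden states by comparing with uniform random numbers
    h_sample = (rng.random(ph_data.shape) < ph_data).astype(float)
    # Step 5: reconstruct the visible units, then recompute the hidden probabilities
    pv_recon = sigmoid(h_sample @ W.T + b)
    ph_recon = sigmoid(pv_recon @ W + c)
    # Step 6: data statistics minus reconstruction statistics, scaled by the learning rate
    n = V.shape[0]
    W += lr * (V.T @ ph_data - pv_recon.T @ ph_recon) / n
    b += lr * (V - pv_recon).mean(axis=0)
    c += lr * (ph_data - ph_recon).mean(axis=0)
    # Mean squared reconstruction error, useful only for monitoring
    return np.mean((V - pv_recon) ** 2)

rng = np.random.default_rng(0)
# Toy data set: two binary patterns, each repeated ten times
V = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 10, dtype=float)

# Step 2: small random weights, zero biases
W = rng.normal(0.0, 0.01, size=(6, 2))
b = np.zeros(6)
c = np.zeros(2)

# Step 7: repeat the update many times
errors = [cd1_update(V, W, b, c, lr=0.1, rng=rng) for _ in range(1000)]
print(errors[0], np.mean(errors[-50:]))
```

The reconstruction error is a convenient progress signal, but it is not the quantity CD optimizes, so it should be read as a rough indicator only.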

Steps Needed:

Implementing RBMs with Sklearn involves several steps:

  1. Data Preprocessing: Prepare your dataset by cleaning, normalizing, and standardizing it as required.
  2. RBM Configuration: Set hyperparameters such as the number of hidden units, learning rate, number of training iterations, and batch size. The number of visible units is not set explicitly; it is inferred from the number of input features.
  3. Model Initialization: Initialize the RBM model using Sklearn’s BernoulliRBM class, specifying the hyperparameters defined in step 2.
  4. Model Training: Train the RBM model using your preprocessed data. Sklearn provides a fit method for this purpose.
  5. Feature Extraction: After training, you can use the RBM as a feature extractor. Transform your data using the RBM to obtain learned features.
  6. Application: Apply these learned features to various machine learning tasks like classification, regression, or clustering.
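The six steps above can be sketched as a single scikit-learn pipeline. This is an illustrative setup rather than the configuration used in the example later in this article: the choice of MinMaxScaler, LogisticRegression, and the hyperparameter values are assumptions for the sketch.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Step 1: load the data and hold out a test set
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Steps 2-3: configure and initialize the RBM (values are illustrative)
rbm = BernoulliRBM(n_components=100, learning_rate=0.06, n_iter=10, random_state=0)

# Steps 4-6: scale to [0, 1], learn RBM features, classify on those features
model = Pipeline(steps=[
    ("scale", MinMaxScaler()),
    ("rbm", rbm),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

# Step 5 in isolation: run every stage except the classifier to get the features
features = model[:-1].transform(X_test)
accuracy = model.score(X_test, y_test)
print(features.shape)  # (360, 100)
print(accuracy)
```

Wrapping the scaler and the RBM in the pipeline means both are fitted on the training split only, which avoids leaking test data into preprocessing.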

Example Implementation of RBM Model

Importing Libraries

  • First, we will import the essential libraries needed to demonstrate the example.

Python3




import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import BernoulliRBM
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report
 
import matplotlib.pyplot as plt


  • Next, we load the dataset. Here, we use scikit-learn's handwritten digits dataset, which contains 1,797 grayscale images of the digits 0 to 9, each 8x8 pixels.

Python3




# Load a dataset (for this example, we'll use the digits dataset)
digits = load_digits()
X = digits.data    # flattened 8x8 images, 64 features per sample
y = digits.target  # digit labels 0-9


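  • For preprocessing, we standardize the dataset using StandardScaler and then split it into training and test sets. Note that BernoulliRBM models its inputs as binary values or probabilities in [0, 1], so a scaler that maps features into that range, such as MinMaxScaler, is usually a better match. Other preprocessing methods can be used as well.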

Python3




# Preprocess the data (you may need different preprocessing for your specific dataset)
scaler = StandardScaler()
X = scaler.fit_transform(X)
# Hold out 20% of the samples for testing
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


  • Then, we will initialize a Bernoulli RBM model. A Bernoulli RBM model takes the following parameters:
    • n_components: The number of hidden units, which determines the dimensionality of the features the RBM learns. The pipeline in this example uses 625 units.
    • learning_rate: Controls how much the RBM’s parameters (weights and biases) are updated at each training step. The pipeline in this example uses 0.00001.
    • n_iter: The number of passes (epochs) over the training data, which is 10 in this example.
    • batch_size: The number of samples used in each mini-batch during training. It is left at its default of 10 here.
    • verbose: The verbosity level. A value of 0 (or False) runs the model in silent mode.
    • random_state: A random seed for reproducible results.
  • After initializing, we fit the model on the training data and transform the inputs into the learned feature representation. In this example, the RBM is fitted inside a pipeline that passes its features to a KNN classifier, and a KNN trained on the standardized pixels alone serves as the baseline.
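As a small, self-contained sketch of the fit and transform calls described above, the snippet below trains a BernoulliRBM on random binary data and inspects what it learned. The data, the variable names (`X_demo`, `rbm_demo`), and the hyperparameter values are hypothetical and chosen only to keep the example fast.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Hypothetical binary data: 200 samples with 16 features each
rng = np.random.default_rng(42)
X_demo = (rng.random((200, 16)) > 0.5).astype(float)

rbm_demo = BernoulliRBM(
    n_components=8,     # number of hidden units
    learning_rate=0.1,
    n_iter=20,
    batch_size=10,
    verbose=0,
    random_state=42,
)

# fit_transform trains the RBM and returns the hidden-unit activation probabilities
H = rbm_demo.fit_transform(X_demo)

print(H.shape)                           # (200, 8)
print(rbm_demo.components_.shape)        # (8, 16): one weight row per hidden unit
print(rbm_demo.intercept_hidden_.shape)  # (8,)
print(rbm_demo.intercept_visible_.shape) # (16,)
```

The transformed features are probabilities between 0 and 1, one per hidden unit, so their dimensionality equals n_components regardless of the input size.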

Python3




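# Baseline: KNN trained directly on the standardized pixels, without RBM features
knn = KNeighborsClassifier(n_neighbors=7, algorithm='kd_tree')
knn.fit(X_train, Y_train)

Y_pred = knn.predict(X_test)
print(
    "KNN without using RBM:\n",
    classification_report(Y_test, Y_pred)
)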


Output:

KNN without using RBM:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        33
           1       0.97      1.00      0.98        28
           2       0.97      1.00      0.99        33
           3       0.97      0.97      0.97        34
           4       0.98      1.00      0.99        46
           5       0.96      0.96      0.96        47
           6       0.97      1.00      0.99        35
           7       1.00      0.94      0.97        34
           8       0.97      0.97      0.97        30
           9       0.95      0.90      0.92        40

    accuracy                           0.97       360
   macro avg       0.97      0.97      0.97       360
weighted avg       0.97      0.97      0.97       360

Python3




# KNN classifier that will consume the RBM features
knn = KNeighborsClassifier(n_neighbors=7, algorithm='kd_tree')

# RBM feature extractor with 625 hidden units
rbm = BernoulliRBM(n_components=625, learning_rate=0.00001, n_iter=10, verbose=False, random_state=42)

# Chain the RBM and KNN, then train the RBM-KNN pipeline
rbm_features_classifier = Pipeline(steps=[("rbm", rbm), ("KNN", knn)])
rbm_features_classifier.fit(X_train, Y_train)


Output:

Pipeline

BernoulliRBM

KNeighborsClassifier
KNeighborsClassifier(algorithm='kd_tree', n_neighbors=7)

Python3




Y_pred = rbm_features_classifier.predict(X_test)
print(
    "KNN using RBM features:\n",
    classification_report(Y_test, Y_pred)
)


Output:

KNN using RBM features:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        33
           1       0.93      1.00      0.97        28
           2       0.94      1.00      0.97        33
           3       0.97      0.97      0.97        34
           4       0.98      1.00      0.99        46
           5       0.98      0.96      0.97        47
           6       0.97      1.00      0.99        35
           7       1.00      0.94      0.97        34
           8       0.97      0.93      0.95        30
           9       0.95      0.90      0.92        40

    accuracy                           0.97       360
   macro avg       0.97      0.97      0.97       360
weighted avg       0.97      0.97      0.97       360

  • Lastly, we will visualize the original data and the transformed feature representation.

Python3




# Extract the RBM features from the fitted pipeline to use as the feature representation
X_transformed = rbm_features_classifier.named_steps["rbm"].transform(X)

# Visualize the original and transformed data (just as an example)
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
axes[0].imshow(X[0].reshape(8, 8), cmap=plt.cm.gray_r, interpolation='nearest')
axes[0].set_title('Original Image')
# The RBM above has 625 hidden units, so its features form a 25 x 25 grid
axes[1].imshow(X_transformed[0].reshape(25, 25), cmap=plt.cm.gray_r, interpolation='nearest')
axes[1].set_title('Transformed Image')

plt.show()


Output:

[Figure: Original and transformed images - Restricted Boltzmann Machine]

Conclusion

In this article, we explored the concepts and steps involved in training and using Restricted Boltzmann Machines (RBMs). RBMs are generative models that have been used for various applications, including dimensionality reduction, feature learning, and collaborative filtering. We discussed the training and inference procedures of RBMs, along with an example that uses RBM features for digit classification with a KNN classifier.

By understanding the principles and techniques behind RBMs, you can leverage this algorithm for a wide range of machine learning tasks. Whether it’s generating new samples or making personalized recommendations, RBMs offer a valuable tool in your machine learning toolkit.

Additional Information:

RBMs can also be stacked to form deep belief networks (DBNs), which have shown remarkable performance in various tasks.

RBMs have been used in unsupervised pre-training for deep neural networks, enabling better generalization and faster convergence.

RBMs are known for their ability to capture complex dependencies in high-dimensional data, making them suitable for tasks such as image recognition and natural language processing.


