
Sparse Coding with a Precomputed Dictionary in Scikit Learn

Last Updated : 31 Jan, 2023

A sparse array or matrix is one in which most of the elements are zero. As a rough rule of thumb, the number of non-zero elements in a sparse matrix is comparable to the number of its rows or columns. Sparse coding is a representation learning method that finds a sparse representation of the input data: each sample is expressed as a linear combination of a few basic elements. These elements are called atoms, and together they compose the dictionary. The atoms in the dictionary need not be orthogonal, and they may form an over-complete spanning set, that is, there may be more atoms than signal dimensions.
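To make the idea of an over-complete dictionary concrete, here is a minimal sketch. The tiny three-atom dictionary `D` and the signal are made up for illustration; only `SparseCoder` itself comes from scikit-learn. The signal lies along the third atom, so a single non-zero coefficient is enough to represent it, whereas the two orthogonal atoms alone would need two coefficients.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

# Hypothetical over-complete dictionary: 3 unit-norm atoms in a 2-D space
s = 1 / np.sqrt(2)
D = np.array([
    [1.0, 0.0],   # atom 0
    [0.0, 1.0],   # atom 1
    [s, s],       # atom 2 (diagonal direction)
])

# A signal that points exactly along atom 2
signal = np.array([[3 * s, 3 * s]])

# Ask orthogonal matching pursuit for a code with a single non-zero entry
coder = SparseCoder(dictionary=D, transform_algorithm="omp",
                    transform_n_nonzero_coefs=1)
code = coder.transform(signal)

# The signal is recovered as a linear combination of atoms: code @ D
reconstruction = code @ D
print(code)
print(np.allclose(reconstruction, signal))
```

The code has one row per sample and one column per atom, so here it has shape (1, 3).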

Here, we are going to transform a signal into a sparse combination of Ricker wavelets, using a precomputed Ricker dictionary. The Ricker (or Mexican hat) wavelet is the negative, normalized second derivative of a Gaussian function. Since the normal probability density function is itself a Gaussian, the wavelet is equivalently its second derivative up to sign and scale. To implement sparse coding with scikit-learn's SparseCoder, we need to know some of its basic parameters and attributes, given below.
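The relation between the Ricker wavelet and the Gaussian can be checked numerically. The sketch below is only a sanity check under assumed values (the width `sigma` and the grid are arbitrary choices): it differentiates a Gaussian twice with finite differences and compares the result, after rescaling, with the Ricker formula and normalization used later in this article.

```python
import numpy as np

sigma = 1.5                              # assumed width, chosen only for this check
t = np.linspace(-12.0, 12.0, 6001)       # fine grid so finite differences are accurate
gauss = np.exp(-t**2 / (2 * sigma**2))

# Numerical second derivative of the Gaussian
second_derivative = np.gradient(np.gradient(gauss, t), t)

# Ricker wavelet with the normalization constant 2 / (sqrt(3*sigma) * pi^(1/4))
norm = 2 / (np.sqrt(3 * sigma) * np.pi**0.25)
ricker = norm * (1 - t**2 / sigma**2) * gauss

# Analytically: gauss'' = -(1 / sigma^2) * (1 - t^2 / sigma^2) * gauss,
# so ricker = -norm * sigma^2 * gauss''
max_abs_diff = np.max(np.abs(ricker - (-norm * sigma**2) * second_derivative))
print(max_abs_diff)  # small; limited only by finite-difference error
```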


  • transform_algorithm : {‘lasso_lars’, ‘lasso_cd’, ‘lars’, ‘omp’, ‘threshold’}, default=’omp’ :- The algorithm used to transform the data. Here we use lasso_lars, omp, and threshold.
    • lasso_lars : Uses LARS (least angle regression) to compute the Lasso solution.
    • omp : Uses orthogonal matching pursuit to estimate the sparse solution.
  • transform_n_nonzero_coefs : int, default=None :- The number of non-zero coefficients to target in each column of the solution. It is only used by omp and lars, and in the omp case it is overridden by transform_alpha.
  • transform_max_iter : int, default=1000 :- The maximum number of iterations to perform when the algorithm is lasso_cd or lasso_lars.
  • transform_alpha : float, default=None :- For lasso_lars and lasso_cd, alpha is the penalty applied to the L1 norm. For threshold, alpha is the absolute value of the threshold below which coefficients are squashed to zero. For omp, alpha is the tolerance parameter, i.e. the targeted reconstruction error.


  • n_features_in_ : int :- The number of features seen during fitting. For SparseCoder this is the length of each atom in the dictionary.
  • n_components_ : int :- The total number of atoms in the dictionary.
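The parameters and attributes listed above can be tried on a small example. This is a sketch under assumptions: the random dictionary `D`, the signal, and the alpha values are made up for illustration, and the attribute names are those exposed by recent scikit-learn versions.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

# Hypothetical dictionary: 8 unit-norm random atoms of length 20
rng = np.random.default_rng(0)
D = rng.standard_normal((8, 20))
D /= np.linalg.norm(D, axis=1, keepdims=True)

# A signal built from two atoms, shaped (n_samples, n_features)
signal = (2.0 * D[1] - 3.0 * D[5]).reshape(1, -1)

# One coder per algorithm, each controlled by the parameter it actually uses
omp = SparseCoder(dictionary=D, transform_algorithm="omp",
                  transform_n_nonzero_coefs=2)
lasso = SparseCoder(dictionary=D, transform_algorithm="lasso_lars",
                    transform_alpha=0.1)
thresh_alpha = 1.0
thresh = SparseCoder(dictionary=D, transform_algorithm="threshold",
                     transform_alpha=thresh_alpha)

codes = {}
for name, coder in [("omp", omp), ("lasso_lars", lasso), ("threshold", thresh)]:
    codes[name] = coder.transform(signal)
    print(name, codes[name].shape, np.count_nonzero(codes[name]))

# Attributes are derived from the shape of the dictionary
print(omp.n_components_, omp.n_features_in_)
```

Note how each algorithm is steered by a different knob: omp by the target number of non-zero coefficients, lasso_lars by the L1 penalty, and threshold by the cut-off applied to the correlations between the signal and the atoms.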

Importing libraries and generating sample dataset

By using Python libraries like NumPy, Matplotlib, and scikit-learn, we can carry out complex computations in a few lines of code. We first generate the sample data that will be used later: we define the resolution, the wavelet width, the subsampling factor, and the total number of components, and then generate a signal that will be transformed into a sparse combination of Ricker wavelets.


# importing python libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import SparseCoder

# defining sample features
resolution = 2048
subSmlFact = 32  # subsampling factor
width = 128
noOfComponents = resolution // subSmlFact
linewidth = 2
# each estimator: (label, transform_algorithm, transform_alpha,
#                  transform_n_nonzero_coefs, plot color)
estimators = [
    ("OMP", "omp", None, 16, "darkgreen"),
    ("Lasso", "lasso_lars", 2, None, "cyan"),
]

# Generate a signal: a step that is 4.0 on the first quarter and -1.0 elsewhere
y = np.linspace(0, resolution - 1, resolution)
first_quarter = y < resolution / 4
y[first_quarter] = 4.0
y[np.logical_not(first_quarter)] = -1.0

Computing the Ricker Dictionary

To compute the Ricker dictionary, we define a function that evaluates a single Ricker wavelet and a second function that stacks such wavelets, one per row, into a dictionary matrix. We then generate two dictionaries: one with a fixed width and one that combines multiple widths.


# defining ricker function
def rickerFunc(resolution, center, width):
    # discrete Ricker (Mexican hat) wavelet sampled at `resolution` points
    x = np.linspace(0, resolution - 1, resolution)
    x = (
        (2 / (np.sqrt(3 * width) * np.pi**0.25))
        * (1 - (x - center) ** 2 / width**2)
        * np.exp(-((x - center) ** 2) / (2 * width**2))
    )
    return x

# defining ricker matrix
def rickerMatrix(width, resolution, n_components):
    # one wavelet per row, with centers spread evenly over the signal
    centers = np.linspace(0, resolution - 1, n_components)
    Dict = np.empty((n_components, resolution))
    for i, center in enumerate(centers):
        Dict[i] = rickerFunc(resolution, center, width)
    # normalize every atom (row) to unit Euclidean norm
    Dict /= np.sqrt(np.sum(Dict**2, axis=1))[:, np.newaxis]
    return Dict

# Computing wavelet dictionary (fixed-width and multi-widths)
fixedDict = rickerMatrix(
    width=width, resolution=resolution, n_components=noOfComponents)
multiDict = np.r_[
    tuple(
        rickerMatrix(width=w, resolution=resolution,
                     n_components=noOfComponents // 5)
        for w in (10, 50, 100, 500, 1000)
    )
]
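As a quick sanity check on the dictionary construction, the same matrix can be built without an explicit loop by using NumPy broadcasting. The helper name `ricker_dictionary` is hypothetical and not part of the original code; the constant prefactor of the wavelet is dropped because every row is normalized to unit norm anyway. The sketch also confirms the expected shapes: 64 atoms for the fixed-width dictionary and 5 × (64 // 5) = 60 atoms for the multi-width one.

```python
import numpy as np

def ricker_dictionary(width, resolution, n_components):
    """Hypothetical vectorized helper: each row is a unit-norm Ricker atom."""
    x = np.arange(resolution, dtype=float)                   # sample positions
    centers = np.linspace(0, resolution - 1, n_components)   # one center per atom
    # Broadcasting gives a (n_components, resolution) grid of scaled offsets
    t = (x[np.newaxis, :] - centers[:, np.newaxis]) / width
    atoms = (1 - t**2) * np.exp(-(t**2) / 2)
    # Row-wise normalization makes the wavelet's constant prefactor irrelevant
    atoms /= np.linalg.norm(atoms, axis=1, keepdims=True)
    return atoms

fixed = ricker_dictionary(width=128, resolution=2048, n_components=64)
multi = np.vstack([
    ricker_dictionary(width=w, resolution=2048, n_components=64 // 5)
    for w in (10, 50, 100, 500, 1000)
])
print(fixed.shape, multi.shape)  # (64, 2048) (60, 2048)
```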

Plotting Results and visualizing the comparison

Here we define a nested for loop: the outer loop iterates over the fixed-width and multi-width dictionaries, and the inner loop iterates over the estimators (OMP and Lasso).


plt.figure(figsize=(14, 6))
# outer loop: dictionaries, inner loop: estimators
for subplot, (Dict, title) in enumerate(
    zip((fixedDict, multiDict), ("fixed width",
                                 "multiple widths"))
):
    plt.subplot(1, 2, subplot + 1)
    plt.title("Sparse coding against %s dictionary" % title)
    plt.plot(y, lw=linewidth, color="red",
             label="Original signal")
    # computing a wavelet approximation
    for name, algo, alpha, n_nonzero, color in estimators:
        sparseCoder = SparseCoder(
            dictionary=Dict,
            transform_n_nonzero_coefs=n_nonzero,
            transform_alpha=alpha,
            transform_algorithm=algo,
        )
        x = sparseCoder.transform(y.reshape(1, -1))
        density = len(np.flatnonzero(x))
        # reconstruct the signal from its sparse code
        x = np.ravel(np.dot(x, Dict))
        squared_error = np.sum((y - x) ** 2)
        plt.plot(
            x, color=color, lw=linewidth,
            label="%s: %s non-zero coefficients,\n%.2f error" % (
                name, density, squared_error),
        )
    plt.legend(shadow=False, loc="best")




After that, we apply one more transformation algorithm, thresholding, followed by a least-squares debiasing of the selected coefficients, and finally visualize the results.


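# Soft thresholding debiasing, added to the subplot of each dictionary
for subplot, Dict in enumerate((fixedDict, multiDict)):
    plt.subplot(1, 2, subplot + 1)
    # transform_alpha=20 and the line color are assumed values;
    # the original arguments were lost from this listing
    sparseCoder = SparseCoder(
        dictionary=Dict, transform_algorithm="threshold",
        transform_alpha=20)
    x = sparseCoder.transform(y.reshape(1, -1))
    _, idx = np.where(x != 0)
    # refit the selected atoms by least squares to remove the shrinkage bias
    x[0, idx], _, _, _ = np.linalg.lstsq(Dict[idx, :].T,
                                         y, rcond=None)
    x = np.ravel(np.dot(x, Dict))
    squared_error = np.sum((y - x) ** 2)
    plt.plot(
        x, color="darkorange", lw=linewidth,
        label="Thresholding w/ debiasing:\n%d non-zero "
        "coefficients, %.2f error"
        % (len(idx), squared_error),
    )
    plt.legend(shadow=False, loc="best")
# adjusting graph view
plt.subplots_adjust(0.04, 0.1, 0.98, 0.92, 0.08, 0.2)
plt.show()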
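The effect of the debiasing step can be seen in isolation on a small example. The sketch below uses a made-up random dictionary and signal, and picks the threshold as half of the largest correlation so that at least one atom survives; these choices are assumptions for illustration only. Because least squares finds the best coefficients on the selected atoms, the reconstruction error after debiasing cannot be larger than before.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

# Hypothetical data: 30 unit-norm random atoms of length 100 and a random signal
rng = np.random.default_rng(0)
D = rng.standard_normal((30, 100))
D /= np.linalg.norm(D, axis=1, keepdims=True)
y = rng.standard_normal(100)

# Threshold at half of the largest correlation, so the best atom is always kept
alpha = 0.5 * np.max(np.abs(D @ y))

coder = SparseCoder(dictionary=D, transform_algorithm="threshold",
                    transform_alpha=alpha)
code = coder.transform(y.reshape(1, -1))
error_before = np.sum((y - np.ravel(code @ D)) ** 2)

# Debiasing: keep the selected atoms, refit their coefficients by least squares
_, idx = np.nonzero(code)
code[0, idx], _, _, _ = np.linalg.lstsq(D[idx, :].T, y, rcond=None)
error_after = np.sum((y - np.ravel(code @ D)) ** 2)

print(len(idx), error_before, error_after)
```

Thresholding only decides which atoms are used; the least-squares refit then chooses their coefficients, which is why the two steps are usually combined.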

