
QR Decomposition in Machine Learning

Last Updated : 01 Feb, 2024

QR decomposition is a way of expressing a matrix as the product of two matrices: Q (an orthogonal matrix) and R (an upper triangular matrix). In this article, we explain decomposition in linear algebra, focusing on QR decomposition among the many available decompositions.

What is QR Decomposition?

Decomposition, or factorization, is the process of breaking a single matrix into the product of multiple matrices to make it easier to work with. Decomposition has various applications in numerical linear algebra, optimization, solving systems of linear equations, and more. QR decomposition is a versatile tool in numerical linear algebra that finds applications in solving linear systems, least squares problems, eigenvalue computations, and related tasks. Its numerical stability and efficiency make it a valuable technique in a wide range of applications.

QR decomposition, also known as QR factorization, is a fundamental matrix decomposition technique in linear algebra. QR decomposition is a matrix factorization technique that decomposes a matrix into the product of an orthogonal matrix (Q) and an upper triangular matrix (R). Given a matrix A (m x n), where m is the number of rows and n is the number of columns, the QR decomposition can be expressed as:

A = QR

QR decomposition finds widespread use in machine learning for tasks like solving linear regression, eigenvalue problems, Gram-Schmidt orthogonalization, handling overdetermined systems, matrix inversion, Gram matrix factorization, and enhancing numerical stability in various algorithms. More details are given in the applications section below.

Concepts Related to QR Decomposition

  1. Matrix Factorization: Matrix factorization involves expressing a matrix as the product of two or more matrices. In QR decomposition, we express a given matrix A as the product of an orthogonal matrix Q and an upper triangular matrix R.
  2. Orthogonal Matrix: An orthogonal matrix Q has the property that its transpose is equal to its inverse (Q^T Q = I, where I is the identity matrix).
    • Properties: Orthogonal matrices preserve the lengths of vectors and their dot products. They play a crucial role in QR decomposition (see the quick numerical check after this list).
  3. Upper Triangular Matrix: A matrix is upper triangular if all entries below the main diagonal are zero. In QR decomposition, R is an upper triangular matrix.
  4. Gram-Schmidt Process (Orthogonalization Process): The Gram-Schmidt process is used to orthogonalize a set of vectors. In the context of QR decomposition, it is applied to the columns of the original matrix to construct an orthogonal matrix Q.
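To make the orthogonal-matrix properties concrete, here is a quick NumPy check, a minimal sketch using a 2×2 rotation matrix (a standard example of an orthogonal matrix):

Python3

import numpy as np

# A rotation matrix is a classic orthogonal matrix
theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Its transpose is its inverse: Q^T Q = I
print(np.allclose(Q.T @ Q, np.eye(2)))           # True

# Orthogonal matrices preserve vector lengths
v = np.array([3.0, 4.0])
print(np.linalg.norm(Q @ v), np.linalg.norm(v))  # both 5.0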

Computing the QR Decomposition

Gram-Schmidt Orthogonalization

The Gram-Schmidt process is often used to orthogonalize the columns of the matrix A. It produces an orthogonal matrix Q.

Given an m × n matrix A,
A = \begin{bmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots\\ a_{m1} & \cdots & a_{mn} \end{bmatrix} = [a_1, a_2, \ldots, a_n],
where a_i is the i-th column of A:

  • Initialize q_1 = \frac{a_1}{\|a_1\|} (normalize the first column)
  • For i =2 to n:
    • v_i = a_i - \sum_{j=1}^{i-1}{\text{proj}_{q_j}(a_i)}, where \text{proj}_{q_j}(a_i) is the projection of a_i onto q_j:
      \text{proj}_{q_j}(a_i) = \frac{a_i \cdot q_j}{\|q_j\|^2} \, q_j
    • q_i = \frac{v_i}{\|v_i\|}
  • This process produces a matrix Q = [q_1, q_2, q_3, \ldots, q_n] with orthonormal columns; a code sketch of the procedure follows.
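The sketch below implements this procedure in NumPy (classical Gram-Schmidt, shown for illustration; np.linalg.qr instead uses the more numerically stable Householder approach):

Python3

import numpy as np

def gram_schmidt_qr(A):
    # Classical Gram-Schmidt: orthonormalize the columns of A while
    # collecting projection coefficients and residual norms in R
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for i in range(n):
        v = A[:, i].copy()
        for j in range(i):
            R[j, i] = Q[:, j] @ A[:, i]  # inner product <a_i, q_j>
            v -= R[j, i] * Q[:, j]       # subtract the projection onto q_j
        R[i, i] = np.linalg.norm(v)      # residual norm goes on the diagonal
        Q[:, i] = v / R[i, i]            # normalize the residual
    return Q, R

A = np.array([[1., 2., 4.], [0., 0., 5.], [0., 3., 6.]])
Q, R = gram_schmidt_qr(A)
print(np.allclose(A, Q @ R))  # True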

Triangularization

Once Q is obtained, the upper triangular matrix R is obtained by multiplying Q^T with the original matrix A:

R = Q^T A

The orthogonal matrix Q is used to triangularize the original matrix A, resulting in an upper triangular matrix R.

Result:

A = QR,

Here,

  • A is the original matrix
  • Q is an orthogonal matrix
  • R is an upper triangular matrix

Orthogonal Matrix Property:

Q^T Q = I

here,

  • Q^T is the transpose of Q
  • I is the identity matrix
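Both identities are easy to verify numerically; here is a quick check using NumPy's built-in QR routine:

Python3

import numpy as np

A = np.random.default_rng(1).normal(size=(4, 3))
Q, R = np.linalg.qr(A)

print(np.allclose(Q.T @ Q, np.eye(3)))  # orthonormal columns: Q^T Q = I
print(np.allclose(R, Q.T @ A))          # triangularization: R = Q^T A
print(np.allclose(R, np.triu(R)))       # R is upper triangular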

Step-by-Step Implementation

Using Gram-Schmidt Process:

First, perform normalization.

q_1 = \frac{1}{\|a_1\|} a_1

Here, \|a_1\| denotes the norm of a_1.

Then, we project a_2 onto q_1:

a_2 = \langle a_2, q_1 \rangle q_1 + q_2' \\ q_2' = a_2 - \langle a_2, q_1 \rangle q_1

Here,

  • \langle a_2, q_1 \rangle is the inner product between a_2 and q_1
  • q_2' is the residual of the projection, orthogonal to q_1

After this projection, we normalize the residual:

q_2 = \frac{1}{\|q_2'\|} q_2'

Then, we project a_3 onto q_1 and q_2:

a_3 = \langle a_3, q_1 \rangle q_1 + \langle a_3, q_2 \rangle q_2 + q_3' \\ q_3' = a_3 - \langle a_3, q_1 \rangle q_1 - \langle a_3, q_2 \rangle q_2

Here,

  • q_3' is the residual, which is orthogonal to both q_1 and q_2

We repeatedly alternate normalization steps, where projection residuals are divided by their norms, and projection steps, where each a_l is projected onto (q_1, q_2, \ldots, q_{l-1}), until a set of orthonormal vectors q_1, \ldots, q_L is obtained.

The residuals are expressed in terms of the normalized vectors as:

q_l' = \|q_l'\| \, q_l

for l = 1, \ldots, L, where we define q_1' = a_1.

Therefore, we can write the projections as:

a_l = \langle a_l, q_1 \rangle q_1 + \cdots + \langle a_l, q_{l-1} \rangle q_{l-1} + \|q_l'\| q_l

Then, we form a matrix using the orthogonal vectors:

Q = [q_1 ... q_L]

To compute the R matrix, we form an upper triangular square matrix:

R = \begin{bmatrix} \|q_1'\| & \langle a_2, q_1 \rangle & \langle a_3, q_1 \rangle & \cdots & \langle a_L, q_1 \rangle \\ 0 & \|q_2'\| & \langle a_3, q_2 \rangle & \cdots & \langle a_L, q_2 \rangle \\ 0 & 0 & \|q_3'\| & & \vdots \\ \vdots & \vdots & & \ddots & \langle a_L, q_{L-1} \rangle \\ 0 & 0 & \cdots & 0 & \|q_L'\| \end{bmatrix}

Multiplying these Q and R together recovers the original matrix A.

QR Decomposition using Python

Python3

import numpy as np
 
# Create a numpy array
arr = np.array([[1, 2, 4], [0, 0, 5],
                [0, 3, 6]])
 
print(arr)
 
# Find the QR factor of array
q, r = np.linalg.qr(arr)
print('\nQ:\n', q)
print('\nR:\n', r)
print(np.allclose(arr, np.dot(q, r)))  # to check result is correct or not

                    

Output:

[[1 2 4]
 [0 0 5]
 [0 3 6]]

Q:
 [[ 1.  0.  0.]
 [ 0.  0. -1.]
 [ 0. -1.  0.]]

R:
 [[ 1.  2.  4.]
 [ 0. -3. -6.]
 [ 0.  0. -5.]]
True

Mathematical Explanation

Let's understand the QR decomposition process by working through the example above by hand.

Suppose we are provided with the matrix A:

A = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 0 & 5 \\ 0 & 3 & 6 \end{bmatrix}

As mentioned in the steps before, we will be using Gram-Schmidt Orthogonalization.

We will find the orthonormal components q_1, q_2, and q_3:

First, perform normalization and we get the first normalized vector:

q_1 = \frac{a_1}{\|a_1\|} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}

The norm of the first column is calculated as:

\|a_1\| = \sqrt{1^2 + 0^2 + 0^2} = 1

The inner product between a_2 and q_1 is \langle a_2, q_1 \rangle = q_1^T a_2 = 2, and the projection of a_2 onto q_1 is this inner product multiplied by q_1.

q_2' is the residual of the projection:

q_2' = a_2 - \langle a_2, q_1 \rangle q_1 = \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix} - 2 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 3 \end{bmatrix}

Now, we will normalize the residual:

q_2 = \frac{q_2'}{\|q_2'\|} = \frac{q_2'}{3} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}

Now, we project a_3 onto q_1 and q_2:

q_3' = a_3 - \langle a_3, q_1 \rangle q_1 - \langle a_3, q_2 \rangle q_2 = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} - 4 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - 6 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 5 \\ 0 \end{bmatrix}

Again, we normalize the residual:

q_3 = \frac{q_3'}{\|q_3'\|} = \frac{q_3'}{5} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}

This gives us the Q matrix:

Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}

The corresponding upper triangular matrix R, built from the norms and inner products computed above (note \langle a_3, q_1 \rangle = 4), is:

R = \begin{bmatrix} 1 & 2 & 4 \\ 0 & 3 & 6 \\ 0 & 0 & 5 \end{bmatrix}
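A quick sanity check (using NumPy) confirms that these hand-computed factors reproduce A:

Python3

import numpy as np

Q = np.array([[1., 0., 0.],
              [0., 0., 1.],
              [0., 1., 0.]])
R = np.array([[1., 2., 4.],
              [0., 3., 6.],
              [0., 0., 5.]])
print(Q @ R)  # recovers the original matrix A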

These hand-calculated values (with Q = [q_1 q_2 q_3], so that A = QR) differ from the values returned by the Python NumPy package. The reason is described below.

Why the NumPy results differ from our step-by-step calculation:

The QR decomposition is not unique all the way down to the signs: one can flip signs in Q as long as the corresponding signs in R are flipped too. Some implementations enforce a positive diagonal in R, but this is just a convention. Since NumPy defers to LAPACK for these linear algebra operations, it follows LAPACK's conventions, which do not enforce such a requirement.
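We can check this in NumPy: rescaling so that R has a positive diagonal turns the LAPACK factors into exactly the hand-computed ones, and the product is unchanged.

Python3

import numpy as np

A = np.array([[1., 2., 4.], [0., 0., 5.], [0., 3., 6.]])
Q, R = np.linalg.qr(A)

# Flip the sign of column i of Q and row i of R wherever R[i, i] < 0;
# since S @ S = I, the product Q2 @ R2 still equals Q @ R = A
S = np.diag(np.sign(np.diag(R)))
Q2, R2 = Q @ S, S @ R

print(np.allclose(A, Q2 @ R2))  # True: still a valid QR factorization
print(R2)                       # matches the hand-computed R above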

Applications

QR decomposition has many applications in linear algebra and machine learning, including the least squares method, linear regression, PCA, eigenvalue problems, and model regularization. A few of them are described below.

  • Linear Regression and Least Squares Problems: QR decomposition is commonly used in linear regression problems and least squares approximations. It allows for a numerically stable and efficient solution to the linear system of equations, especially when dealing with overdetermined systems (a code sketch follows this list).
  • Feature Selection and Rank Deficiency: In machine learning, datasets often have a high dimensionality, and some features may be correlated. QR decomposition can help identify linear dependencies between features and assist in feature selection. It is particularly useful when dealing with rank-deficient matrices.
  • Principal Component Analysis (PCA): PCA involves finding the principal components of a dataset, which are the directions of maximum variance. QR decomposition can be applied in the context of PCA to efficiently compute the principal components of a covariance matrix.
  • Regularization Techniques: When applying regularization techniques in machine learning models (e.g., Ridge Regression), QR decomposition can be employed to efficiently solve the regularized linear regression problems.
  • Eigenvalue Problems: QR decomposition can be part of algorithms for solving eigenvalue problems, such as spectral clustering and dimensionality reduction techniques like Locally Linear Embedding (LLE).
  • Signal Processing: QR decomposition can be employed for tasks like filtering and denoising in signal processing.
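As an example of the least squares application, here is a minimal sketch (with synthetic data) of solving an overdetermined linear regression via QR: factor X = QR, then solve the triangular system R b = Q^T y.

Python3

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))            # 100 samples, 3 features
true_b = np.array([2.0, -1.0, 0.5])
y = X @ true_b + 0.01 * rng.normal(size=100)

Q, R = np.linalg.qr(X)                   # reduced QR: Q is 100x3, R is 3x3
b = np.linalg.solve(R, Q.T @ y)          # solve the triangular system R b = Q^T y

print(b)                                                     # close to true_b
print(np.allclose(b, np.linalg.lstsq(X, y, rcond=None)[0]))  # True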

Advantages

  • It allows for a numerically stable and efficient solution of systems of equations.
  • Unlike LU decomposition, QR decomposition does not require the matrix to be square.

Disadvantages:

  • Each step is expensive to compute: one step of the basic QR method (sketched below) costs O(n^3).
  • Usually, many steps (far more than n) are required to converge. In fact, the basic QR method can be arbitrarily slow when eigenvalues are close to each other.
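For reference, here is a minimal sketch of the basic (unshifted) QR iteration these points refer to: repeatedly factor A = QR and form RQ, a similarity transform that preserves eigenvalues; for well-separated eigenvalues the iterates approach an upper triangular matrix whose diagonal holds them.

Python3

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
X = A.copy()
for _ in range(50):          # many iterations may be needed in general
    Q, R = np.linalg.qr(X)
    X = R @ Q                # similar to A, so eigenvalues are preserved

print(np.sort(np.diag(X)))            # approximate eigenvalues
print(np.sort(np.linalg.eigvals(A)))  # compare with NumPy's eigensolver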

