Matrix multiplication is an integral part of scientific computing, and it becomes costly to compute when the matrices are large. One convenient way to compute the product of two matrices is to use the methods provided by PyTorch. This article covers how to perform matrix multiplication using PyTorch.
PyTorch and tensors:
PyTorch is an open-source package for neural-network-based deep learning projects, developed by Facebook’s AI research team. It can stand in for NumPy while adding GPU acceleration. One of the important classes provided by this library is Tensor, an n-dimensional array similar to the arrays provided by the NumPy package. PyTorch offers many methods that can be applied to a Tensor, which makes computations faster and easier to write. A Tensor can hold only elements of the same data type.
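As a small illustration of the points above, the sketch below builds a Tensor from a nested list and inspects its shape and data type. The variable names are illustrative only.

```python
import torch

# A tensor built from a nested Python list; the dtype is inferred from the values.
t = torch.tensor([[1, 2, 3], [4, 5, 6]])
print(t.shape)  # torch.Size([2, 3])
print(t.dtype)  # torch.int64

# Mixing ints and floats still gives one dtype for the whole tensor.
mixed = torch.tensor([1, 2.5])
print(mixed.dtype)  # torch.float32
```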
Matrix multiplication with PyTorch:
The matrix multiplication methods in PyTorch expect their inputs to be Tensors. The options covered in this article are:
- torch.mm()
- torch.matmul()
- torch.bmm()
- @ operator
torch.mm():
This method computes the matrix product of an m×n Tensor and an n×p Tensor, giving an m×p Tensor. It accepts only two-dimensional Tensors, not one-dimensional ones, and it does not support broadcasting. Broadcasting is the way Tensors of different shapes are treated in an operation: the smaller Tensor is expanded to match the shape of the larger Tensor. The syntax of the function is given below.
torch.mm(Tensor_1, Tensor_2, out=None)
The first two parameters are the Tensors to multiply. The third, out, is optional: it names another Tensor in which the output values are stored.
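The sketch below illustrates the optional out argument and the two-dimensional restriction. The tensors are filled with ones purely for illustration.

```python
import torch

a = torch.ones(2, 3)
b = torch.ones(3, 4)

# Preallocated tensor that receives the 2x4 product.
result = torch.empty(2, 4)
torch.mm(a, b, out=result)
print(result.shape)  # torch.Size([2, 4])

# torch.mm rejects a 1D input instead of broadcasting it.
v = torch.ones(3)
try:
    torch.mm(a, v)
except RuntimeError as err:
    print("torch.mm failed:", err)
```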
Example 1: Matrices of the same dimension
Here both inputs are 3×3 matrices, so the output is also a 3×3 matrix.
tensor([[10, 28, 28], [27, 73, 62], [13, 37, 53]])
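The output above can be reproduced with a short sketch. The input values below are an assumption: they are chosen so that their product matches the printed result, and are not confirmed to be the inputs of the original run.

```python
import torch

# Assumed inputs, chosen so that the product matches the output printed above.
mat_1 = torch.tensor([[1, 2, 3], [4, 3, 8], [1, 7, 2]])
mat_2 = torch.tensor([[2, 4, 1], [1, 3, 6], [2, 6, 5]])

product = torch.mm(mat_1, mat_2)
print(product)
```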
Example 2: Matrices of different dimensions
Here tensor_1 is a 2×2 matrix and tensor_2 is a 2×3 matrix, so the output is a 2×3 matrix.
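tensor([[1.4013e-45, 0.0000e+00, 2.8026e-45], [0.0000e+00, 5.6052e-45, 0.0000e+00]])
Values of the order of 1e-45 are denormalized floats rather than products of ordinary matrix entries. They are typical of an uninitialized float Tensor (for example, one created with torch.Tensor(2, 3)), so this output should be read as showing the 2×3 shape only, not the actual product.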
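The shapes in this example can be checked with a short sketch. The input values below are illustrative assumptions, not the inputs behind the output printed above, so the numbers differ.

```python
import torch

# Hypothetical inputs: a 2x2 matrix and a 2x3 matrix.
tensor_1 = torch.tensor([[2, 3], [4, 3]])
tensor_2 = torch.tensor([[2, 4, 1], [1, 3, 6]])

product = torch.mm(tensor_1, tensor_2)
print(product.shape)  # torch.Size([2, 3])
print(product)        # tensor([[ 7, 17, 20], [11, 25, 22]])
```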
torch.matmul():
This method multiplies two vectors (one-dimensional Tensors), two 2D matrices, or a mix of the two. It also supports broadcasting and batch operations. The operation performed depends on the dimensions of the input Tensors. The general syntax is given below.
torch.matmul(Tensor_1, Tensor_2, out=None)
The table below lists the possible dimensions of the arguments and the operation performed for each combination.

| argument_1 | argument_2 | Action taken |
|---|---|---|
| 1-dimensional | 1-dimensional | The scalar (dot) product is calculated |
| 2-dimensional | 2-dimensional | General matrix multiplication is done |
| 1-dimensional | 2-dimensional | A ‘1’ is prepended to the dimensions of tensor-1 to match tensor-2, and removed again after the multiplication |
| 2-dimensional | 1-dimensional | Matrix-vector product is calculated |
| at least 1-dimensional | at least 1-dimensional | Batched matrix multiplication is done when at least one argument is N-dimensional (N>2) |
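The cases in the table can be checked by looking at the shape of each result. This is a minimal sketch with tensors of ones; the names vec, mat and batch are illustrative.

```python
import torch

vec = torch.ones(3)          # 1-dimensional
mat = torch.ones(3, 3)       # 2-dimensional
batch = torch.ones(2, 3, 3)  # 3-dimensional: a batch of two 3x3 matrices

print(torch.matmul(vec, vec).shape)    # torch.Size([])         dot product
print(torch.matmul(mat, mat).shape)    # torch.Size([3, 3])     matrix product
print(torch.matmul(vec, mat).shape)    # torch.Size([3])        1 prepended, then removed
print(torch.matmul(mat, vec).shape)    # torch.Size([3])        matrix-vector product
print(torch.matmul(batch, mat).shape)  # torch.Size([2, 3, 3])  mat broadcast over the batch
print(torch.matmul(batch, vec).shape)  # torch.Size([2, 3])     vec broadcast over the batch
```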
Example 1: Arguments of the same dimension
Single dimensional tensors : tensor(36)
3x3 dimensional tensors : tensor([[10, 28, 28], [27, 73, 62], [13, 37, 53]])
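Both results above can be reproduced with the sketch below. The input values are assumptions chosen so that the results match the printed output.

```python
import torch

# Assumed inputs, chosen so that the results match the output printed above.
vec_1 = torch.tensor([3, 6, 2])
vec_2 = torch.tensor([4, 1, 9])
mat_1 = torch.tensor([[1, 2, 3], [4, 3, 8], [1, 7, 2]])
mat_2 = torch.tensor([[2, 4, 1], [1, 3, 6], [2, 6, 5]])

dot = torch.matmul(vec_1, vec_2)      # 1D x 1D: scalar product
product = torch.matmul(mat_1, mat_2)  # 2D x 2D: matrix product

print("Single dimensional tensors :", dot)
print("3x3 dimensional tensors :", product)
```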
Example 2: Arguments of different dimensions
1D-2D multiplication : tensor([29, 38, 61])
2D-1D multiplication : tensor([21, 61, 59])
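The sketch below reproduces both results. The inputs are assumptions chosen to match the printed output; note that the two lines use different vector and matrix pairs.

```python
import torch

# Assumed inputs, chosen so that the results match the output printed above.
vec_1 = torch.tensor([3, 6, 2])
vec_2 = torch.tensor([4, 1, 9])
mat_1 = torch.tensor([[1, 2, 3], [4, 3, 8], [1, 7, 2]])
mat_2 = torch.tensor([[2, 4, 1], [1, 3, 6], [2, 6, 5]])

# 1D x 2D: vec_1 is treated as a 1x3 row, and the extra dimension is dropped afterwards.
row_result = torch.matmul(vec_1, mat_1)

# 2D x 1D: ordinary matrix-vector product.
col_result = torch.matmul(mat_2, vec_2)

print("1D-2D multiplication :", row_result)
print("2D-1D multiplication :", col_result)
```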
Example 3: N-dimensional argument (N>2)
matrix A : tensor([[[ 0.5433,  0.0546, -0.5301],
         [ 0.9275, -0.0420, -1.3966],
         [-1.1851, -0.2918, -0.7161]],

        [[-0.8659,  1.8350,  1.6068],
         [-1.1046,  1.0045, -0.1193],
         [ 0.9070,  0.7325, -0.4547]]])
matrix B : tensor([ 1.8785, -0.4231,  0.1606])
Output : tensor([[ 0.9124,  1.5358, -2.2177],
        [-2.1448, -2.5191,  1.3208]])
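A minimal sketch of this case follows, assuming the inputs are generated with torch.randn (an assumption; the original generator is not shown). Because the values are random, only the shapes will match the output above.

```python
import torch

# Random inputs, so the values differ on every run.
mat_a = torch.randn(2, 3, 3)  # batch of two 3x3 matrices
vec_b = torch.randn(3)        # 1D tensor, broadcast across the batch

out = torch.matmul(mat_a, vec_b)
print(out.shape)  # torch.Size([2, 3])

# Each batch entry equals the plain matrix-vector product for that entry.
same = torch.allclose(out[0], torch.matmul(mat_a[0], vec_b), atol=1e-6)
print(same)
```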
torch.bmm():
This method provides batched matrix multiplication for the case where both inputs are 3-dimensional: a b×n×m Tensor and a b×m×p Tensor, giving a b×n×p Tensor. The first (batch) dimension of both Tensors must be the same. It does not support broadcasting. The syntax is given below.
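torch.bmm(Tensor_1, Tensor_2, deterministic=False, out=None)
The “deterministic” parameter takes a boolean value. False selects a faster calculation that is non-deterministic; True selects a slower calculation that is deterministic. This parameter belongs to older PyTorch releases, where it applies only to sparse-dense CUDA bmm; recent releases document the function as torch.bmm(input, mat2, *, out=None), without it.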
In the example below, the first matrix has dimension 2×3×3 and the second matrix has dimension 2×3×4, so the output has dimension 2×3×4.
matrix A : tensor([[[-0.0135, -0.9197, -0.3395],
         [-1.0369, -1.3242,  1.4799],
         [-0.0182, -1.2917,  0.6575]],

        [[-0.3585, -0.0478,  0.4674],
         [-0.6688, -0.9217, -1.2612],
         [ 1.6323, -0.0640,  0.4357]]])
matrix B : tensor([[[ 0.2431, -0.1044, -0.1437, -1.4982],
         [-1.4318, -0.2510,  1.6247,  0.5623],
         [ 1.5265, -0.8568, -2.1125, -0.9463]],

        [[ 0.0182,  0.5207,  1.2890, -1.3232],
         [-0.2275, -0.8006, -0.6909, -1.0108],
         [ 1.3881, -0.0327, -1.4890, -0.5550]]])
Output : tensor([[[ 0.7954,  0.5231, -0.7752, -0.1756],
         [ 3.9031, -0.8274, -5.1288, -0.5915],
         [ 2.8488, -0.2372, -3.4850, -1.3212]],

        [[ 0.6532, -0.1637, -1.1251,  0.2633],
         [-1.5532,  0.4309,  1.6527,  2.5167],
         [ 0.6492,  0.8870,  1.4994, -2.3371]]])
Note: The matrices vary from run to run because they are filled with random values.
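A minimal sketch of a 2×3×3 by 2×3×4 batched product follows, assuming the inputs come from torch.randn (the original generator is not shown). It uses only the two required arguments and also illustrates that torch.bmm does not broadcast.

```python
import torch

# Random inputs, so the values differ on every run.
batch_a = torch.randn(2, 3, 3)
batch_b = torch.randn(2, 3, 4)

out = torch.bmm(batch_a, batch_b)
print(out.shape)  # torch.Size([2, 3, 4])

# bmm gives the same result as torch.mm applied to each pair in the batch.
per_pair = torch.stack([torch.mm(a, b) for a, b in zip(batch_a, batch_b)])
matches = torch.allclose(out, per_pair, atol=1e-6)
print(matches)

# A 2D second argument is rejected instead of being broadcast over the batch.
try:
    torch.bmm(batch_a, torch.randn(3, 4))
except RuntimeError as err:
    print("torch.bmm failed:", err)
```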
@ operator:
The @ operator, when applied to Tensors, computes the dot product of 1D Tensors and ordinary matrix multiplication of 2D Tensors; it follows the same rules as torch.matmul(). If both Tensors have the same number of dimensions, the multiplication is carried out directly, without any broadcasting or prepending. If one of them has a different number of dimensions, the appropriate broadcasting is carried out first and then the multiplication is performed. This operator applies to N-dimensional Tensors as well.
1D matrices output : tensor(36)
2D matrices output : tensor([[10, 28, 28], [27, 73, 62], [13, 37, 53]])
N-D matrices output : tensor([[[ 0.7953,  0.5231, -0.7751, -0.1757],
         [ 3.9030, -0.8274, -5.1287, -0.5915],
         [ 2.8487, -0.2372, -3.4850, -1.3212]],

        [[ 0.6531, -0.1637, -1.1250,  0.2633],
         [-1.5532,  0.4309,  1.6526,  2.5166],
         [ 0.6491,  0.8869,  1.4995, -2.3370]]])
Mixed matrices output: tensor([218, 596, 562])
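The integer results above can be reproduced with the sketch below. The inputs are assumptions chosen to match the printed output, and the mixed case is assumed to be a 1D Tensor chained with two 2D Tensors. The N-D inputs are random, so only their shape is checked.

```python
import torch

# Assumed inputs, chosen so that the integer results match the output printed above.
vec_1 = torch.tensor([3, 6, 2])
vec_2 = torch.tensor([4, 1, 9])
mat_1 = torch.tensor([[1, 2, 3], [4, 3, 8], [1, 7, 2]])
mat_2 = torch.tensor([[2, 4, 1], [1, 3, 6], [2, 6, 5]])

out_1d = vec_1 @ vec_2             # dot product of two 1D tensors
out_2d = mat_1 @ mat_2             # ordinary matrix product
out_mixed = vec_1 @ mat_1 @ mat_2  # evaluated left to right: (vec_1 @ mat_1) @ mat_2

# Random 3D inputs: the values differ on every run.
out_nd = torch.randn(2, 3, 3) @ torch.randn(2, 3, 4)

print("1D matrices output :", out_1d)
print("2D matrices output :", out_2d)
print("N-D matrices output shape :", out_nd.shape)
print("Mixed matrices output:", out_mixed)
```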