Cosine Similarity Calculation Between Two Matrices in MATLAB

MATLAB (Matrix Laboratory) is a high-level programming language and numerical computing environment for mathematical computation and simulation. It is used in a wide range of applications, including signal and image processing, control systems, and engineering and scientific computing.

Cosine Similarity

Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space. It is the cosine of the angle between the two vectors, that is, their dot product divided by the product of their Euclidean norms. The value lies between -1 and 1: 1 means the vectors point in the same direction, 0 means they are orthogonal, and -1 means they point in opposite directions. Cosine similarity is often used in natural language processing and information retrieval to determine the similarity between documents or text.
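
As a small illustration of the definition, the sketch below evaluates the cosine similarity for three pairs of vectors. The helper name cos_sim and the example vectors are chosen here for illustration only.

```matlab
% Cosine similarity of two vectors: dot product over the product of norms.
% cos_sim is a hypothetical helper name, not a built-in MATLAB function.
cos_sim = @(u, v) dot(u, v) / (norm(u) * norm(v));

same_direction = cos_sim([1 2], [2 4]);    % parallel vectors
orthogonal     = cos_sim([1 0], [0 1]);    % perpendicular vectors
opposite       = cos_sim([1 2], [-1 -2]);  % antiparallel vectors

% The three values should be close to 1, 0 and -1 respectively
fprintf('%.4f %.4f %.4f\n', same_direction, orthogonal, opposite);
```

The three cases mark the two ends and the midpoint of the range: vectors that differ only by a positive scale factor score 1, perpendicular vectors score 0, and vectors pointing in opposite directions score -1.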

Steps:

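To calculate cosine similarity between two matrices in MATLAB, treat each row as one vector and follow these steps:

Load or define the two matrices A and B that you want to compare. They must have the same number of columns.
Compute the Euclidean norm of each row, for example with vecnorm(A, 2, 2) (R2017b or later) or sqrt(sum(A.^2, 2)). Note that norm(A, 2) on a matrix returns the matrix 2-norm, a single scalar, not the row norms.
Calculate the dot product between every row of A and every row of B with the matrix product A * B'. The dot function only pairs up corresponding rows or columns, so it does not give all row pairs.
Calculate the cosine similarity by dividing each dot product by the product of the Euclidean norms of the two rows. Equivalently, divide each row by its norm first; the dot products of the normalized rows are then the cosine similarities, and no further division is needed.
If desired, you can calculate the average cosine similarity over all row pairs.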
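
When only corresponding rows need to be compared (row i of A with row i of B) rather than every pair, the steps can be sketched with dot along the second dimension. This is a minimal sketch that assumes MATLAB R2017b or later for vecnorm; the variable name paired_similarity is illustrative.

```matlab
A = [1 2; 3 4];
B = [5 6; 7 8];

% Dot product of corresponding rows (dimension 2), one value per row
row_dots = dot(A, B, 2);

% Euclidean norm of each row
norm_A = vecnorm(A, 2, 2);
norm_B = vecnorm(B, 2, 2);

% Cosine similarity of row i of A with row i of B
paired_similarity = row_dots ./ (norm_A .* norm_B);

% These values match the diagonal of the full pairwise matrix
full_matrix = (A * B') ./ (norm_A * norm_B');
disp([paired_similarity, diag(full_matrix)]);
```

The paired version returns one value per row, while the matrix product A * B' returns a value for every combination of rows.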

Example 1:

Matlab

% Define matrices A and B (each row is one vector)
A = [1 2; 3 4];
B = [5 6; 7 8];

% Euclidean norm of each row (vecnorm requires R2017b or later).
% norm(A,2) would return the matrix 2-norm, a single scalar.
norm_A = vecnorm(A, 2, 2);
norm_B = vecnorm(B, 2, 2);

% Dot product between every row of A and every row of B
dot_product = A * B';

% Divide each dot product by the product of the two row norms
% Approximately [0.9734 0.9676; 0.9987 0.9972]
cosine_similarity = dot_product ./ (norm_A * norm_B')

% Average cosine similarity over all row pairs (optional)
% Approximately 0.9842
mean_cosine_similarity = mean(mean(cosine_similarity))

Example 2:

Matlab

% Define matrices A and B (each row is one vector)
A = [1 2; 3 4];
B = [5 6; 7 8];

% Normalize each row to unit Euclidean length
A = bsxfun(@rdivide, A, sqrt(sum(A.^2,2)));
B = bsxfun(@rdivide, B, sqrt(sum(B.^2,2)));

% Dot product between every row of A and every row of B
dot_product = A * B';

% The rows are unit vectors, so the dot products are
% already the cosine similarities
% Approximately [0.9734 0.9676; 0.9987 0.9972]
cosine_similarity = dot_product

% Average cosine similarity over all row pairs (optional)
% Approximately 0.9842
mean_cosine_similarity = mean(mean(cosine_similarity))

Code explanation:

The first two statements define the two matrices A and B that you want to compare.
The next two statements normalize each row to unit length. The bsxfun function divides each row by the square root of the sum of squares of its elements, which is its Euclidean norm. The cosine of the angle between two vectors equals their dot product divided by the product of their magnitudes, so once every row has magnitude 1 that denominator is 1.
The matrix product A * B' calculates the dot product between every row of A and every row of B. Element (i, j) of the result is the dot product of row i of A with row j of B.
The next statement assigns the cosine similarity. Since the rows have already been normalized, each dot product is equal to the cosine of the angle between the two rows and can be used directly.
The final statement calculates the average cosine similarity over all row pairs, if desired. mean averages along one dimension at a time, so it is applied twice to reduce the matrix to a single number.

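Note: In the code, the normalization step uses bsxfun, a built-in MATLAB function that applies an element-wise operation while expanding singleton dimensions. In MATLAB R2016b and later, implicit expansion makes the plain expression A ./ sqrt(sum(A.^2,2)) equivalent, and R2017b adds vecnorm for computing row norms. Another alternative is a for-loop that iterates over the rows and normalizes each one individually.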
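
The following sketch compares these normalization options on the same matrix. It assumes R2017b or later so that all variants run; the variable names are illustrative.

```matlab
A = [1 2; 3 4];

% Option 1: bsxfun, as in Example 2 (also works in older releases)
A_bsxfun = bsxfun(@rdivide, A, sqrt(sum(A.^2, 2)));

% Option 2: implicit expansion (R2016b or later)
A_implicit = A ./ sqrt(sum(A.^2, 2));

% Option 3: vecnorm for the row norms (R2017b or later)
A_vecnorm = A ./ vecnorm(A, 2, 2);

% Option 4: explicit for-loop over the rows
A_loop = zeros(size(A));
for i = 1:size(A, 1)
    A_loop(i, :) = A(i, :) / norm(A(i, :));
end

% All variants should agree up to floating-point rounding
diff_implicit = max(abs(A_bsxfun(:) - A_implicit(:)));
diff_vecnorm  = max(abs(A_bsxfun(:) - A_vecnorm(:)));
diff_loop     = max(abs(A_bsxfun(:) - A_loop(:)));
fprintf('Largest differences: %g %g %g\n', diff_implicit, diff_vecnorm, diff_loop);
```

The vectorized forms are usually preferable for large matrices, while the loop makes the per-row operation explicit.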

Advantages of Cosine Similarity Between Two Matrices:

Direction-based: Cosine similarity compares two vectors, or the rows of two matrices, by the angle between them.
Robustness to magnitude: Cosine similarity is insensitive to the magnitude of the vectors, which makes it a useful tool for comparing vectors that might have very different magnitudes.
Normalization: Cosine similarity always lies between -1 and 1, which makes it easier to interpret and compare results.
Simple to implement: Cosine similarity needs only dot products and norms, so it is easy to implement with basic linear algebra operations in many programming languages.
Widely used: Cosine similarity is a widely used technique in information retrieval, text classification, and other areas where comparing the similarity of vectors is important.
Works well with sparse data: Only positions where both vectors are non-zero contribute to the dot product, so cosine similarity is well-suited for comparing sparse vectors that have many zero elements.
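
The insensitivity to magnitude can be checked directly. In this minimal sketch, the helper cos_sim and the vectors are illustrative; rescaling either vector by a positive factor leaves the result unchanged.

```matlab
% cos_sim is a hypothetical helper name, not a built-in MATLAB function
cos_sim = @(u, v) dot(u, v) / (norm(u) * norm(v));

u = [1 2 3];
v = [4 5 6];

original = cos_sim(u, v);
rescaled = cos_sim(1000 * u, 0.01 * v);   % very different magnitudes

% Both values should be approximately 0.9746
fprintf('%.4f %.4f\n', original, rescaled);
```

The same property means that two vectors with very different lengths but the same direction are treated as identical.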

Disadvantages of Cosine Similarity Between Two Matrices:

Not sensitive to magnitude differences: Cosine similarity ignores the magnitude of the vectors, which is a disadvantage in cases where magnitude carries information.
Angle only: Cosine similarity is based solely on the angle between vectors, which means it may not capture all aspects of the similarity between two matrices.
Limited range for non-negative data: Cosine similarity ranges from -1 to 1 in general, but when all elements are non-negative (for example, word counts) the values fall between 0 and 1 and can never indicate opposition. If negative similarities are important for such data, another similarity measure may be more suitable.
May not capture context: Cosine similarity compares vectors element by element and does not take the context or structure of the data into account. If context or structure is important, other similarity measures or techniques may be more suitable.
Can lose contrast in high dimensions: As the dimensionality of the data increases, unrelated vectors tend to become nearly orthogonal, so similarity values can cluster together and become harder to tell apart. For such data, dimensionality reduction or other similarity measures may be more appropriate.
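
As a rough illustration of how the sign of the data affects the range of values, the sketch below compares a small non-negative matrix with one that has mixed signs. Both matrices and the helper names row_normalize and pairwise_cos are made up for this example, and implicit expansion (R2016b or later) is assumed.

```matlab
% Hypothetical helpers: normalize rows, then take all pairwise dot products
row_normalize = @(M) M ./ sqrt(sum(M.^2, 2));
pairwise_cos  = @(A, B) row_normalize(A) * row_normalize(B)';

counts = [3 0 1; 0 2 2; 1 1 0];   % made-up non-negative count data
S_counts = pairwise_cos(counts, counts);

signed = [1 -2; -1 2; 2 1];       % made-up data with mixed signs
S_signed = pairwise_cos(signed, signed);

% Non-negative data cannot produce a negative similarity,
% while the first two rows of "signed" point in opposite directions
fprintf('min for counts: %.4f\n', min(S_counts(:)));
fprintf('min for signed: %.4f\n', min(S_signed(:)));
```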
