Distance Between Rows in R

Last Updated : 23 Sep, 2022

In this article, we will learn various approaches to calculating the distance between the given rows in the R programming language.

The dist() function in R computes pairwise distances between the rows of a numeric matrix or data frame. The default method is "euclidean", the ordinary straight-line distance. It has the following syntax:

Syntax: dist(vect, method = "euclidean", diag = TRUE or FALSE, upper = TRUE or FALSE)

Parameters:

  • vect: A numeric matrix or data frame; distances are computed between its rows
  • method: The distance measure to use. It must be one of "euclidean" (the default), "maximum", "manhattan", "canberra", "binary" or "minkowski"
  • diag: logical value (TRUE or FALSE) that conveys whether the diagonal of the distance matrix should be printed by print.dist or not.
  • upper: logical value (TRUE or FALSE) that conveys whether the upper triangle of the distance matrix should be printed by print.dist or not.

Return type:

It returns an object of class "dist".
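
As a minimal sketch of how the returned "dist" object can be handled, the snippet below prints it with the diag and upper options and expands it with as.matrix() so single distances can be looked up by row number. The variable names d and full are illustrative only.

```r
# Build the same 4 x 3 matrix used throughout this article
matr <- matrix(1:12, nrow = 4)

# dist() returns a compact object of class "dist" that stores
# only the pairwise distances (one value per pair of rows)
d <- dist(matr)
print(class(d))

# diag and upper only affect how the object is printed
print(d, diag = TRUE, upper = TRUE)

# as.matrix() expands it into a full symmetric matrix
full <- as.matrix(d)

# Distance between row 1 and row 4
print(full[1, 4])
```

Converting with as.matrix() is convenient for small inputs; for large inputs the compact "dist" form uses roughly half the memory.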

Calculating Euclidean Distance in R to Get the Distance between Rows

In mathematics, the Euclidean distance between two points is the length of the line segment between them. It is also known as the straight-line distance. The Euclidean distance between any two row vectors vect1 and vect2 of the matrix is given by the following formula:

Euclidean distance = $\sqrt{\sum_i (\text{vect1}_i - \text{vect2}_i)^2}$

where,

  • vect1 is the first vector
  • vect2 is the second vector
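
The formula can be checked by hand for a single pair of rows. The sketch below uses the 4 x 3 matrix from this article; the names a, b, manual and builtin are illustrative only.

```r
# Same matrix as in the article's examples
matr <- matrix(1:12, nrow = 4)

a <- matr[1, ]   # first row:  1 5 9
b <- matr[2, ]   # second row: 2 6 10

# Square the element-wise differences, sum them, take the square root
manual <- sqrt(sum((a - b)^2))
print(manual)

# The same pair of rows taken from dist()
builtin <- as.matrix(dist(matr))[1, 2]
print(all.equal(manual, builtin))
```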

R




# creating matrix
matr <- matrix(1:12, nrow = 4)
print("Original Matrix")
print(matr)
  
# calculating the distance between
# rows of matrix                                        
print("Euclidean Distance between rows of matrix")
dist(matr)


Output:

[1] "Original Matrix"
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
[1] "Euclidean Distance between rows of matrix"
         1        2        3
2 1.732051                  
3 3.464102 1.732051         
4 5.196152 3.464102 1.732051

Explanation : 

  • The Euclidean distance between row1 and row2 is 1.732051
  • The Euclidean distance between row1 and row3 is 3.464102
  • The Euclidean distance between row1 and row4 is 5.196152
  • The Euclidean distance between row2 and row3 is 1.732051
  • The Euclidean distance between row2 and row4 is 3.464102
  • The Euclidean distance between row3 and row4 is 1.732051

The distance between any row and itself is 0, so the diagonal is not printed by default.

Calculating Canberra Distance in R to Get the Distance between Rows

The Canberra distance between any two rows of the matrix is given by the following equation:

$\sum_i \frac{|\text{vect1}_i - \text{vect2}_i|}{|\text{vect1}_i| + |\text{vect2}_i|}$

To compute it, call dist() with method = "canberra". For an input with n rows, the printed result is a lower triangle with n - 1 rows and n - 1 columns, where each cell holds the distance between the pair of rows given by its row and column labels.

Syntax: dist(vect, method = "canberra", diag = TRUE or FALSE, upper = TRUE or FALSE)

R




# Creating matrix
matr <- matrix(1:12, nrow = 4)
print("Original Matrix")
print(matr)
  
# Calculating the distance 
# between rows of matrix                                        
print("Canberra Distance between rows of matrix")
dist(matr,method="canberra")


Output:

[1] "Original Matrix"
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
[1] "Canberra Distance between rows of matrix"
          1         2         3
2 0.4768740                    
3 0.7666667 0.3245421          
4 0.9736264 0.5670996 0.2530021
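
As a check on the first entry of the output above, the Canberra sum can be written out by hand for rows 1 and 2. This is a minimal sketch; the names a, b, manual and builtin are illustrative only.

```r
# Same matrix as in the article's examples
matr <- matrix(1:12, nrow = 4)

a <- matr[1, ]   # 1 5 9
b <- matr[2, ]   # 2 6 10

# Each term is |a_i - b_i| / (|a_i| + |b_i|): here 1/3 + 1/11 + 1/19
manual <- sum(abs(a - b) / (abs(a) + abs(b)))
print(manual)

# The same pair of rows taken from dist()
builtin <- as.matrix(dist(matr, method = "canberra"))[1, 2]
print(all.equal(manual, builtin))
```

Because each term is scaled by the magnitude of the values, differences between small values weigh more than the same differences between large values, which is why row1 and row2 are further apart here than row3 and row4.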

Calculating Maximum Distance in R to Get the Distance between Rows

The maximum distance (also called the Chebyshev distance) between two rows is the largest absolute difference between their corresponding elements, $\max_i |\text{vect1}_i - \text{vect2}_i|$. The values in the output below are whole numbers only because the input matrix holds integers; in general the result can be any non-negative number. It is calculated by calling dist() with method = "maximum".

R




# Creating matrix
matr <- matrix(1:12, nrow = 4)
print("Original Matrix")
print(matr)
  
# Calculating the distance between
# rows of matrix                                        
print("Maximum Distance between rows of matrix")
dist(matr,method="maximum")


Output:

[1] "Original Matrix"
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
[1] "Maximum Distance between rows of matrix"
  1 2 3
2 1    
3 2 1  
4 3 2 1
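
In the matrix above every column differs by the same amount between two rows, so it does not show which element decides the result. The sketch below uses a small hypothetical matrix x (not part of the original example) where the differences vary, and recomputes the value as the largest absolute element-wise difference.

```r
# Hypothetical two-row matrix with unequal column differences
x <- rbind(c(1, 10, 3),
           c(4,  2, 5))

# Absolute differences per column are 3, 8 and 2; the largest one wins
manual <- max(abs(x[1, ] - x[2, ]))
print(manual)

# The same value from dist()
builtin <- as.matrix(dist(x, method = "maximum"))[1, 2]
print(all.equal(manual, builtin))
```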

Calculating Binary Distance in R to Get the Distance between Rows

The binary distance (also called the asymmetric binary distance) treats each row as a vector of bits: non-zero elements are "on" and zero elements are "off". The distance between two rows is the proportion of positions in which only one of the two is on, among the positions in which at least one is on. It is calculated by calling dist() with method = "binary". Every element of the example matrix below is non-zero, so all rows are "on" in every position and every distance is 0.

Syntax: dist(vect, method = "binary", diag = TRUE or FALSE, upper = TRUE or FALSE)

R




# Creating matrix
matr <- matrix(1:12, nrow = 4)
print("Original Matrix")
print(matr)
  
# Calculating the distance between 
# rows of matrix                                        
print("Binary Distance between rows of matrix")
dist(matr, method = "binary")


Output:

[1] "Original Matrix"
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
[1] "Binary Distance between rows of matrix"
  1 2 3
2 0    
3 0 0  
4 0 0 0
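
All distances above are 0 because the matrix has no zero entries. A less trivial case, sketched below with a hypothetical 0/1 matrix bin (not part of the original example), shows how the binary distance behaves when rows differ in which positions are non-zero.

```r
# Hypothetical 0/1 matrix: non-zero means "on", zero means "off"
bin <- rbind(c(1, 0, 1, 1),
             c(1, 1, 0, 1),
             c(0, 0, 0, 1))

d <- dist(bin, method = "binary")
print(d)

# Manual check for rows 1 and 2
a <- bin[1, ] != 0
b <- bin[2, ] != 0

# Positions where exactly one row is on, divided by
# positions where at least one row is on
manual <- sum(xor(a, b)) / sum(a | b)
print(manual)
```

For rows 1 and 2, two of the four positions are on in only one row, so the distance is 2 / 4 = 0.5.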

Calculating Minkowski Distance in R to Get the Distance between Rows

The Minkowski distance is a generalization of both the Euclidean and the Manhattan distance. The Minkowski distance between any two row vectors of the input matrix is given by the following equation:

$\left(\sum_i |\text{vect1}_i - \text{vect2}_i|^p\right)^{1/p}$

where,

  • vect1 is the first vector
  • vect2 is the second vector
  • p is the power of the distance, a positive number (p = 1 gives the Manhattan distance and p = 2 gives the Euclidean distance)

Syntax: dist(vect, method = "minkowski", p = power, diag = TRUE or FALSE, upper = TRUE or FALSE)

The following code snippet uses p = 3 to calculate the Minkowski distance between the row vectors.

R




# Creating matrix
matr <- matrix(1:12, nrow = 4)
print("Original Matrix")
print(matr)
  
# Calculating the distance between
# rows of matrix                                        
print("Minkowski Distance between rows of matrix")
dist(matr, method = "minkowski", p = 3)


Output:

[1] "Original Matrix"
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
[1] "Minkowski Distance between rows of matrix"
         1        2        3
2 1.442250                  
3 2.884499 1.442250         
4 4.326749 2.884499 1.442250
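
The relationship between the Minkowski distance and the Manhattan and Euclidean distances can be checked directly. The sketch below compares dist() results for p = 1 and p = 2 with the dedicated methods and recomputes one p = 3 value by hand; the variable names are illustrative only.

```r
# Same matrix as in the article's examples
matr <- matrix(1:12, nrow = 4)

# p = 1 should match the Manhattan distance
m1  <- dist(matr, method = "minkowski", p = 1)
man <- dist(matr, method = "manhattan")
print(all.equal(as.vector(m1), as.vector(man)))

# p = 2 should match the Euclidean distance
m2  <- dist(matr, method = "minkowski", p = 2)
euc <- dist(matr, method = "euclidean")
print(all.equal(as.vector(m2), as.vector(euc)))

# Manual p = 3 value for rows 1 and 2: (1 + 1 + 1)^(1/3)
a <- matr[1, ]
b <- matr[2, ]
manual <- sum(abs(a - b)^3)^(1/3)
print(manual)
```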

