How to Use Dist Function in R?

Last Updated : 24 Dec, 2021

In this article, we will see how to use the dist() function in the R programming language.

R provides a built-in dist() function that calculates six different kinds of distance between each unique pair of rows of a numeric matrix (called a two-dimensional vector in this article). dist() accepts a numeric matrix as its first argument and a method that names the type of distance to be measured. The method must be one of Euclidean, Maximum, Manhattan, Canberra, Binary, or Minkowski; if it is omitted, Euclidean is used. The remaining arguments are optional.

Syntax:

dist(vect, method = " ", diag = TRUE or FALSE, upper = TRUE or FALSE)

Parameters:

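  • vect: A two-dimensional vector, that is, a numeric matrix (or data frame) in which each row is one of the vectors to compare. R's own documentation names this argument x.
  • method: The distance to be measured. It must be one of "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski". The default is "euclidean".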
  • diag: logical value (TRUE or FALSE) that conveys whether the diagonal of the distance matrix should be printed by print.dist or not.
  • upper: logical value (TRUE or FALSE) that conveys whether the upper triangle of the distance matrix should be printed by print.dist or not.
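As a minimal sketch of what diag and upper do, the example below builds a small matrix (the values and the names m, d_lower and d_full are made up for illustration) and prints the same distances in two layouts. It suggests that the two arguments change only how the result is printed, not the distances that are stored.

```r
# Three short vectors bound row-wise into a 3 x 2 matrix (illustrative values)
m <- rbind(a = c(0, 0), b = c(3, 4), c = c(6, 8))

# Default layout: only the lower triangle is printed
d_lower <- dist(m, method = "euclidean")

# Same distances, but print also shows the diagonal and the upper triangle
d_full <- dist(m, method = "euclidean", diag = TRUE, upper = TRUE)

print(d_lower)
print(d_full)

# The stored distances are identical in both objects
same_values <- all.equal(as.vector(d_lower), as.vector(d_full))
print(same_values)  # TRUE
```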

Return type:

It returns an object of class "dist".
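The returned object can be inspected or converted to an ordinary matrix. The following is a small sketch under assumed example data (the matrix m and the variable names are illustrative, not part of the article's examples); it uses only class(), attr() and as.matrix() from base R.

```r
# Illustrative 3 x 2 matrix; each row is one vector
m <- rbind(a = c(0, 0), b = c(3, 4), c = c(6, 8))

# method is omitted, so the default "euclidean" is used
d <- dist(m)

print(class(d))         # "dist"
print(attr(d, "Size"))  # 3, the number of rows that were compared

# Convert to a full symmetric matrix to look up a single pair by name
full <- as.matrix(d)
print(full["a", "b"])   # 5
```

Converting with as.matrix() is convenient when a single distance has to be picked out by row name, since the "dist" object itself stores only the lower triangle.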

Now let us see how to calculate these distances using dist() function.

Euclidean Distance

Euclidean distance between two points in Euclidean space is the length of the line segment between the two points. It can be calculated from the Cartesian coordinates of the points using the Pythagorean theorem, which is why it is occasionally called the Pythagorean distance.

For example, in a 2-dimensional space with two points Point1 (x1, y1) and Point2 (x2, y2), the Euclidean distance is given by $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$.

The Euclidean distance between two vectors is given by,

$\sqrt{\sum_{i} (\text{vect1}_i - \text{vect2}_i)^2}$

where,

  • vect1 is the first vector
  • vect2 is the second vector

For example, we are given two vectors, vect1 as (2, 1, 5, 8) and vect2 as (1, 2, 4, 9). Their Euclidean distance is given by $\sqrt{(2 - 1)^2 + (1 - 2)^2 + (5 - 4)^2 + (8 - 9)^2} = \sqrt{4}$, which is equal to 2.
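The worked example above can be checked in R. This is a minimal sketch: by_hand applies the formula directly and by_dist calls dist() on the two vectors bound as rows (both variable names are illustrative).

```r
vect1 <- c(2, 1, 5, 8)
vect2 <- c(1, 2, 4, 9)

# Formula written out: square root of the sum of squared differences
by_hand <- sqrt(sum((vect1 - vect2)^2))

# Same pair through dist(); as.numeric() drops the "dist" attributes
by_dist <- as.numeric(dist(rbind(vect1, vect2), method = "euclidean"))

print(by_hand)  # 2
print(by_dist)  # 2
```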

Syntax:

dist(vect, method = "euclidean", diag = TRUE or FALSE, upper = TRUE or FALSE)

Example:  Euclidean distance

R




# R program to illustrate how to calculate
# euclidean distance using dist() function
 
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
 
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
 
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
 
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
 
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
 
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
 
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
 
print("Euclidean distance between each pair of vectors is: ")
cat("\n\n")
 
# Compute the Euclidean distance between every pair of rows.
# diag = TRUE and upper = TRUE make print show the zero diagonal
# and the upper triangle in addition to the lower triangle.
dist(twoDimensionalVect, method = "euclidean", diag = TRUE, upper = TRUE)


Output:

Manhattan Distance

Manhattan distance is a distance metric between two points in an N-dimensional vector space. It is defined as the sum of the absolute differences between the coordinates in corresponding dimensions. For example, in a 2-dimensional space with two points Point1 (x1, y1) and Point2 (x2, y2), the Manhattan distance is given by $|x_1 - x_2| + |y_1 - y_2|$.

In R, Manhattan distance is calculated with respect to vectors. The Manhattan distance between two vectors is given by,

$\sum_{i} |\text{vect1}_i - \text{vect2}_i|$

where, 

  • vect1 is the first vector
  • vect2 is the second vector

For example, we are given two vectors, vect1 as (3, 6, 8, 9) and vect2 as (1, 7, 8, 10). Their Manhattan distance is given by, |3 – 1| + |6 – 7| + |8 – 8| + |9 – 10|  which is equal to 4.
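As a quick check of the worked example above, the sketch below computes the same Manhattan distance once from the formula and once with dist() (the names by_hand and by_dist are illustrative).

```r
vect1 <- c(3, 6, 8, 9)
vect2 <- c(1, 7, 8, 10)

# Formula written out: sum of absolute differences
by_hand <- sum(abs(vect1 - vect2))

# Same pair through dist()
by_dist <- as.numeric(dist(rbind(vect1, vect2), method = "manhattan"))

print(by_hand)  # 4
print(by_dist)  # 4
```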

Syntax:

dist(vect, method = "manhattan", diag = TRUE or FALSE, upper = TRUE or FALSE)

Example: Manhattan distance

R




# R program to illustrate how to calculate
# Manhattan distance
# using dist() function
 
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
 
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
 
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
 
 
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
 
 
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
 
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
 
 
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
 
print("Manhattan distance between each pair of vectors is: ")
cat("\n\n")
 
# Compute the Manhattan distance between every pair of rows.
# diag = TRUE and upper = TRUE make print show the zero diagonal
# and the upper triangle in addition to the lower triangle.
dist(twoDimensionalVect, method = "manhattan", diag = TRUE, upper = TRUE)


Output:

Maximum Distance

The maximum distance (also known as the Chebyshev distance) between two vectors, A and B, is the largest absolute difference between their corresponding elements. In R, maximum distance is calculated with respect to vectors. The maximum distance between two vectors is given by,

$\max_{i} |\text{vect1}_i - \text{vect2}_i|$

where,

  • vect1 is the first vector
  • vect2 is the second vector

For example, we are given two vectors, vect1 as (3, 6, 8, 9) and vect2 as (1, 8, 9, 10). Their Maximum distance is given by, max(|3 – 1|, |6 – 8|, |8 – 9|, |9 – 10|)  which is equal to 2.
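The worked example above can be reproduced with a few lines of R. This is a minimal sketch with illustrative variable names, comparing the formula with the result of dist().

```r
vect1 <- c(3, 6, 8, 9)
vect2 <- c(1, 8, 9, 10)

# Formula written out: largest absolute difference between paired elements
by_hand <- max(abs(vect1 - vect2))

# Same pair through dist()
by_dist <- as.numeric(dist(rbind(vect1, vect2), method = "maximum"))

print(by_hand)  # 2
print(by_dist)  # 2
```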

Syntax:

dist(vect, method = "maximum", diag = TRUE or FALSE, upper = TRUE or FALSE)

Example: Maximum distance

R




# R program to illustrate how to calculate Maximum distance
# using dist() function
 
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
 
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
 
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
 
 
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
 
 
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
 
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
 
 
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3, vect4, vect5, vect6)
 
print("Maximum distance between each pair of vectors is: ")
cat("\n\n")
 
# Compute the maximum distance between every pair of rows.
# diag = TRUE and upper = TRUE make print show the zero diagonal
# and the upper triangle in addition to the lower triangle.
dist(twoDimensionalVect, method = "maximum", diag = TRUE, upper = TRUE)


Output:

Canberra Distance

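The Canberra distance is a weighted version of the Manhattan distance: each absolute difference between corresponding elements is divided by the sum of the absolute values of those two elements before the terms are added up. In R, Canberra distance is calculated with respect to vectors. The Canberra distance between two vectors is given by,

$\sum_{i} \dfrac{|\text{vect1}_i - \text{vect2}_i|}{|\text{vect1}_i| + |\text{vect2}_i|}$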

where,

  • vect1 is the first vector
  • vect2 is the second vector

For example, we are given two vectors, vect1 as (2, 2, 7, 5) and vect2 as (3, 8, 3, 5). Their Canberra distance is given by, |2 – 3| / (2 + 3) + |2 – 8| / (2 + 8) + |7 – 3| / (7 + 3) + |5 – 5| / (5 + 5)  = 0.2 + 0.6 + 0.4 + 0 which is equal to 1.2.
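The arithmetic above can be verified in R. The sketch below (variable names are illustrative) evaluates the formula term by term and compares the total with dist(); the comparison uses all.equal() because the result is a floating-point number.

```r
vect1 <- c(2, 2, 7, 5)
vect2 <- c(3, 8, 3, 5)

# Individual terms of the sum: 0.2, 0.6, 0.4 and 0
terms <- abs(vect1 - vect2) / (abs(vect1) + abs(vect2))
by_hand <- sum(terms)

# Same pair through dist()
by_dist <- as.numeric(dist(rbind(vect1, vect2), method = "canberra"))

print(terms)
print(by_hand)                      # 1.2
print(all.equal(by_hand, by_dist))  # TRUE
```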

Syntax:

dist(vect, method = "canberra", diag = TRUE or FALSE, upper = TRUE or FALSE)

Example: Canberra distance

R




# R program to illustrate how to calculate
# Canberra distance
# using dist() function
 
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
 
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
 
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
 
 
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
 
 
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
 
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
 
 
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
 
print("Canberra distance between each pair of vectors is: ")
cat("\n\n")
 
# Compute the Canberra distance between every pair of rows.
# diag = TRUE and upper = TRUE make print show the zero diagonal
# and the upper triangle in addition to the lower triangle.
dist(twoDimensionalVect, method = "canberra", diag = TRUE, upper = TRUE)


Output:

Binary Distance

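The Binary distance (also called the asymmetric binary distance) treats each vector as a sequence of bits: non-zero elements are "on" and zero elements are "off". The distance between two vectors, A and B, is the proportion of positions in which only one of the two vectors is on, among the positions in which at least one of them is on. The Binary distance between two vectors is given by,

$\dfrac{\#\{i : \text{exactly one of } \text{vect1}_i, \text{vect2}_i \text{ is non-zero}\}}{\#\{i : \text{at least one of } \text{vect1}_i, \text{vect2}_i \text{ is non-zero}\}}$

Here,

  • vect1 is the first vector
  • vect2 is the second vector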
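To make the definition concrete, here is a small sketch with two made-up 0/1 vectors (a, b and the other variable names are illustrative). At least one vector is on in four positions, and exactly one is on in two of them, so the distance should come out as 2 / 4 = 0.5.

```r
a <- c(1, 1, 0, 0, 1)
b <- c(1, 0, 1, 0, 1)

# Positions where at least one vector is non-zero ("on")
on_either <- (a != 0) | (b != 0)

# Positions where exactly one of the two vectors is on
on_only_one <- xor(a != 0, b != 0)

by_hand <- sum(on_only_one) / sum(on_either)
by_dist <- as.numeric(dist(rbind(a, b), method = "binary"))

print(by_hand)  # 0.5
print(by_dist)  # 0.5

# Vectors without any zeros are "on" everywhere, so their distance is 0
no_zeros <- as.numeric(dist(rbind(c(1, 4, 8), c(9, 4, 1)), method = "binary"))
print(no_zeros)  # 0
```

The last call shows why this method is mainly useful for presence/absence data: the actual magnitudes are ignored, and only the zero versus non-zero pattern matters.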

Syntax:

dist(vect, method = "binary", diag = TRUE or FALSE, upper = TRUE or FALSE)

Example: Binary distance

R




# R program to illustrate how to calculate
# Binary distance
# using dist() function
 
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
 
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
 
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
 
 
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
 
 
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
 
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
 
 
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
 
print("Binary distance between each pair of vectors is: ")
cat("\n\n")
 
# Compute the Binary distance between every pair of rows.
# Every value in these vectors is non-zero, so each element counts
# as "on" in both vectors and every pairwise distance is 0.
# diag = TRUE and upper = TRUE only change what print shows.
dist(twoDimensionalVect, method = "binary", diag = TRUE, upper = TRUE)


Output:

Minkowski Distance  

Minkowski distance is a distance measured between two points in N-dimensional space. It is a generalization of the Euclidean distance (p = 2) and the Manhattan distance (p = 1). For example, in a 2-dimensional space with two points Point1 (x1, y1) and Point2 (x2, y2), the Minkowski distance is given by $(|x_1 - x_2|^p + |y_1 - y_2|^p)^{1/p}$. In R, Minkowski distance is calculated with respect to vectors. The Minkowski distance between two vectors is given by,

$\left(\sum_{i} |\text{vect1}_i - \text{vect2}_i|^p\right)^{1/p}$

where,

  • vect1 is the first vector
  • vect2 is the second vector
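  • p is the power of the distance, a positive number that does not have to be an integer (dist() uses p = 2 by default)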

R's built-in dist() function calculates the Minkowski distance between each pair of vectors (rows) in a two-dimensional vector when method is set to "minkowski".

Syntax:

dist(vect, method = "minkowski", p = number, diag = TRUE or FALSE, upper = TRUE or FALSE)

For example, we are given two vectors, vect1 as (3, 6, 8, 9) and vect2 as (2, 7, 7, 10). Their Minkowski distance with p = 2 is given by $(|3 - 2|^2 + |6 - 7|^2 + |8 - 7|^2 + |9 - 10|^2)^{1/2} = 4^{1/2}$, which is equal to 2.
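The example above, and the relationship to the Manhattan and Euclidean distances, can be checked with the following sketch. The helper minkowski_by_hand is a hypothetical function written here only to mirror the formula; it is not part of R.

```r
vect1 <- c(3, 6, 8, 9)
vect2 <- c(2, 7, 7, 10)
pair <- rbind(vect1, vect2)

# Hypothetical helper: the Minkowski formula written out directly
minkowski_by_hand <- function(x, y, p) sum(abs(x - y)^p)^(1 / p)

d_p2 <- as.numeric(dist(pair, method = "minkowski", p = 2))
d_p1 <- as.numeric(dist(pair, method = "minkowski", p = 1))

print(d_p2)  # 2, the value from the worked example
print(d_p1)  # 4, the Manhattan distance of the same pair

# The helper agrees with dist() for another power as well
print(all.equal(minkowski_by_hand(vect1, vect2, 3),
                as.numeric(dist(pair, method = "minkowski", p = 3))))

# p = 1 and p = 2 match the "manhattan" and "euclidean" methods
print(all.equal(d_p1, as.numeric(dist(pair, method = "manhattan"))))
print(all.equal(d_p2, as.numeric(dist(pair, method = "euclidean"))))
```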

Example: Minkowski distance

R




# R program to illustrate how to calculate
# Minkowski distance
# using dist() function
 
# Initializing a vector
vect1 <- c(1, 4, 8, 9, 2, 3)
 
# Initializing another vector
vect2 <- c(9, 4, 1, 2, 4, 7)
 
# Initializing another vector
vect3 <- c(1, 7, 9, 3, 2, 8)
 
 
# Initializing another vector
vect4 <- c(2, 1, 4, 7, 8, 9)
 
 
# Initializing another vector
vect5 <- c(1, 4, 8, 3, 9, 2)
 
# Initializing another vector
vect6 <- c(3, 7, 8, 6, 5, 9)
 
 
# Row-bind the vectors into a single matrix (one vector per row)
twoDimensionalVect <- rbind(vect1, vect2, vect3,
                            vect4, vect5, vect6)
 
print("Minkowski distance between each pair of vectors is: ")
cat("\n\n")
 
# Compute the Minkowski distance between every pair of rows.
# p = 2 makes the result identical to the Euclidean distance.
# diag = TRUE and upper = TRUE make print show the zero diagonal
# and the upper triangle in addition to the lower triangle.
dist(twoDimensionalVect, method = "minkowski", diag = TRUE, upper = TRUE, p = 2)


Output:


