ML | Types of Linkages in Clustering

Prerequisites: Hierarchical Clustering

Hierarchical clustering is a versatile technique used in machine learning and data analysis for grouping similar data points into clusters. This process involves organizing the data points into a hierarchical structure, where clusters are either merged into larger clusters in a bottom-up approach (agglomerative) or divided into smaller clusters in a top-down approach (divisive). Regardless of the direction, the computation of distances between sub-clusters is crucial in hierarchical clustering.

The various types of linkages describe distinct methods for measuring the distance between two sub-clusters of data points, influencing the overall clustering outcome.
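To make the influence on the outcome concrete, here is a minimal sketch, assuming three hypothetical one-dimensional clusters named A, B and C and using absolute difference as the distance D. It applies the single, complete and average linkage definitions covered in this article and reports which pair of clusters each one would merge first. The names `pairwise`, `closest_pair` and the toy data are illustrative assumptions, not part of any library.

```python
from itertools import combinations

def pairwise(R, S):
    """All distances D(i, j) between a point i in R and a point j in S (1-D)."""
    return [abs(i - j) for i in R for j in S]

# The three linkage rules, each reducing the pairwise distances to one number
linkages = {
    "single": lambda R, S: min(pairwise(R, S)),
    "complete": lambda R, S: max(pairwise(R, S)),
    "average": lambda R, S: sum(pairwise(R, S)) / (len(R) * len(S)),
}

# Hypothetical toy clusters on a line
clusters = {"A": [0.0, 4.0], "B": [5.0], "C": [7.5]}

def closest_pair(clusters, linkage):
    """Pair of cluster names with the smallest linkage distance."""
    return min(
        combinations(clusters, 2),
        key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
    )

choices = {name: closest_pair(clusters, fn) for name, fn in linkages.items()}
for name, pair in choices.items():
    print(name, "->", pair)
```

On this toy data, single linkage merges A with B because the points 4.0 and 5.0 are close, while complete and average linkage merge B with C because A is spread out.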

1. Single Linkage:

For two clusters R and S, the single linkage returns the minimum distance between two points i and j such that i belongs to R and j belongs to S.

[Tex]L(R, S) = \min_{i \in R,\, j \in S} D(i, j)[/Tex]
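The formula can be sketched in a few lines of Python. This is a minimal illustration, assuming Euclidean distance for D and two hypothetical 2-D clusters; the function name `single_linkage` is chosen here for illustration only.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def single_linkage(R, S):
    """Minimum of D(i, j) over all i in R and j in S."""
    return min(dist(i, j) for i in R for j in S)

# Hypothetical toy clusters
R = [(0.0, 0.0), (1.0, 0.0)]
S = [(4.0, 0.0), (6.0, 0.0)]

print(single_linkage(R, S))  # 3.0, the gap between (1, 0) and (4, 0)
```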

2. Complete Linkage:

For two clusters R and S, the complete linkage returns the maximum distance between two points i and j such that i belongs to R and j belongs to S.

[Tex]L(R, S) = \max_{i \in R,\, j \in S} D(i, j)[/Tex]
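A minimal sketch of this formula, again assuming Euclidean distance for D and two hypothetical 2-D clusters; the function name `complete_linkage` is illustrative.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def complete_linkage(R, S):
    """Maximum of D(i, j) over all i in R and j in S."""
    return max(dist(i, j) for i in R for j in S)

# Hypothetical toy clusters
R = [(0.0, 0.0), (1.0, 0.0)]
S = [(4.0, 0.0), (6.0, 0.0)]

print(complete_linkage(R, S))  # 6.0, the gap between (0, 0) and (6, 0)
```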

3. Average Linkage:

For two clusters R and S, the distance between every data point i in R and every data point j in S is computed first, and then the arithmetic mean of these distances is taken. Average linkage returns this mean.

[Tex]L(R, S) = \frac{1}{n_{R} \times n_{S}} \sum_{i \in R} \sum_{j \in S} D(i, j)[/Tex]

where,

  • [Tex]n_{R}[/Tex] : Number of data-points in R
  • [Tex]n_{S}[/Tex] : Number of data-points in S
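The average can be sketched the same way. This is a minimal illustration, assuming Euclidean distance for D and two hypothetical 2-D clusters; the function name `average_linkage` is illustrative.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def average_linkage(R, S):
    """Arithmetic mean of D(i, j) over all n_R * n_S pairs (i in R, j in S)."""
    total = sum(dist(i, j) for i in R for j in S)
    return total / (len(R) * len(S))

# Hypothetical toy clusters
R = [(0.0, 0.0), (1.0, 0.0)]
S = [(4.0, 0.0), (6.0, 0.0)]

print(average_linkage(R, S))  # (4 + 6 + 3 + 5) / 4 = 4.5
```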

Last Updated : 20 Mar, 2024