**Prerequisite:** K-Means Clustering – Introduction

K-Means Algorithm has a few limitations which are as follows:

- It can only identify spherical clusters, i.e., it fails when clusters are non-spherical or vary in size and density.
- It can converge to a local minimum and is sensitive to outliers in the data.

Bisecting K-Means Algorithm is a modification of the K-Means algorithm. It can produce either a partitional or a hierarchical clustering, and it can recognize clusters of varying shape and size. It also has a practical advantage:

- It beats K-Means on entropy-based cluster-quality measures.

Note: Entropy is a measure of the randomness in the data being processed (think of flipping a coin). If the entropy of the data being processed is high, it is difficult to draw conclusions from that data.
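The coin-flip intuition can be made precise with Shannon entropy, H = −Σ p·log₂(p). The small sketch below is only an illustration of the note above, not part of the clustering algorithm itself:

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))  # fair coin: maximally random, 1.0 bit
print(entropy([0.9, 0.1]))  # biased coin: more predictable, ~0.47 bits
```

A fair coin is the hardest to predict, so it has the highest entropy; the more lopsided the probabilities, the closer the entropy gets to zero.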

**Example 1 —** Say you walk into a hall full of chairs and sit on one of them without observing any other chair in particular, or considering whether you could hear the speaker on the stage from that seat. This kind of approach can be called ‘K-Means’.

**Example 2 —** Say you walk into a hall full of chairs. You observe all the chairs and decide which one to occupy based on various factors, such as whether you could hear the speaker from that position and whether the chair is placed near the air conditioner. This kind of approach can be called ‘Bisecting K-Means’.

**Bisecting K-Means Algorithm:**

```
Initialize the list of clusters with one cluster containing all points.
repeat
    Remove a cluster from the list of clusters.
    { Perform several "trial" bisections of the selected cluster. }
    for i = 1 to number of trials do
        Bisect the selected cluster using basic K-Means (K = 2).
    end for
    Add the two clusters from the bisection with the lowest total SSE
    to the list of clusters.
until the list of clusters contains K clusters
```
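The pseudocode above can be sketched in Python with NumPy. The function names, the number of trials, and the fallback for degenerate data are illustrative choices, not part of any standard API:

```python
import numpy as np

def bisect_once(points, n_trials=5, n_iters=20, rng=None):
    """Split `points` into two clusters with basic K-Means (K = 2),
    repeated over several trials; return the labels of the lowest-SSE trial."""
    rng = np.random.default_rng(rng)
    best_labels, best_sse = None, np.inf
    for _ in range(n_trials):
        # pick two distinct points as the initial centroids
        centers = points[rng.choice(len(points), size=2, replace=False)].copy()
        for _ in range(n_iters):
            # assign every point to its nearest centroid, then recompute centroids
            dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            for j in (0, 1):
                if np.any(labels == j):
                    centers[j] = points[labels == j].mean(axis=0)
        if len(np.unique(labels)) < 2:
            continue  # degenerate trial: one side ended up empty
        sse = ((points - centers[labels]) ** 2).sum()
        if sse < best_sse:
            best_labels, best_sse = labels, sse
    if best_labels is None:
        # fallback for degenerate data: peel off a single point
        best_labels = np.zeros(len(points), dtype=int)
        best_labels[0] = 1
    return best_labels

def bisecting_kmeans(points, k, seed=0):
    """Repeatedly bisect the highest-SSE cluster until `k` clusters remain.
    Returns a list of index arrays, one per cluster."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    clusters = [np.arange(len(points))]  # start with one cluster holding all points
    while len(clusters) < k:
        # choose the cluster with the highest SSE to split next
        sses = [((points[idx] - points[idx].mean(axis=0)) ** 2).sum()
                for idx in clusters]
        idx = clusters.pop(int(np.argmax(sses)))
        labels = bisect_once(points[idx], rng=rng)
        clusters.append(idx[labels == 0])
        clusters.append(idx[labels == 1])
    return clusters
```

Note that every split uses basic K-Means with K = 2; the outer loop simply repeats the bisection until the list holds K clusters.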

- The working of this algorithm can be condensed into two steps.
- First, assume the number of clusters required at the final stage is ‘K’ = 3 (any value can be assumed if not specified).

**Step 01: **

- All points/objects/instances are put into 1 cluster.

**Step 02:**

- Apply basic K-Means (K = 2) to bisect the cluster. The cluster ‘GFG’ is split into two clusters, ‘GFG1’ and ‘GFG2’. The required number of clusters has not been obtained yet, so one of the two clusters is split further — the one with the higher SSE (the SSE formula is explained below).

- After splitting the cluster ‘GFG’ into ‘GFG1’ and ‘GFG2’, we calculate the SSE of the two clusters separately as SSE(C) = Σx∈C ‖x − c‖², where c is the centroid of cluster C. The cluster with the higher SSE will be split further; the cluster with the lower SSE contains comparatively less error and won’t be split further.
- Here, if calculation shows that the cluster ‘GFG1’ has the higher SSE, we split it into (GFG1)′ and (GFG1)′′. The number of clusters required at the final stage was given as 3, and we have now obtained 3 clusters: (GFG1)′, (GFG1)′′, and ‘GFG2’.
- If the required number of clusters is not obtained, we should continue splitting until they are produced.
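For reference, the SSE used to decide which cluster to split is the sum of squared Euclidean distances from each point to the cluster centroid. A minimal sketch, with made-up sample coordinates:

```python
import numpy as np

def cluster_sse(points):
    """Sum of squared Euclidean distances from each point to the cluster centroid."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    return float(((points - centroid) ** 2).sum())

print(cluster_sse([[0, 0], [2, 0], [1, 3]]))  # centroid is (1, 1) -> SSE = 8.0
```

A tight cluster has a small SSE; a spread-out cluster has a large SSE, which is why the high-SSE cluster is the most promising one to split next.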

**Example:**

Consider a cluster ‘G’ = ( P1, P2, P3, P4, P5, P6, P7, P8, P9, P10 ) , consisting of 10 points. K=3 (given)

- Applying the Bisecting K-Means algorithm, the cluster ‘G’ is first split into two clusters, ‘G1’ and ‘G2’. The total number of clusters required at the final stage is ‘K’ = 3 (given).
- Since the required clusters are not obtained yet, we should split one of the two clusters obtained.
- The cluster with the higher SSE is selected for splitting (since the cluster with the lower SSE is less erroneous). Here, the cluster with the higher SSE is ‘G1’, so it is split into (G1)′ and (G1)′′.
- Thus, the required number of clusters is obtained.

