A clustering is perfectly homogeneous when every cluster contains only data points that belong to a single class. The homogeneity score (homogeneity_score) measures how close the output of a clustering algorithm comes to this ideal.
This metric is independent of the absolute values of the labels. A permutation of the cluster label values does not change the score in any way.
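As a small illustration of this label-independence, the sketch below renames the clusters of a made-up labeling and scores both versions. The label lists and the mapping dictionary are assumptions chosen only for the demonstration.

```python
from sklearn.metrics import homogeneity_score

# Hypothetical toy labels, chosen only for illustration
labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [0, 0, 1, 1, 1, 2]

# Rename the clusters (0 -> 2, 1 -> 0, 2 -> 1); the grouping itself is unchanged
mapping = {0: 2, 1: 0, 2: 1}
labels_pred_permuted = [mapping[k] for k in labels_pred]

score_original = homogeneity_score(labels_true, labels_pred)
score_permuted = homogeneity_score(labels_true, labels_pred_permuted)

# Both scores should agree up to floating-point rounding
print(score_original, score_permuted)
```

Only the grouping of the samples matters here, not the integer names given to the clusters.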
Syntax : sklearn.metrics.homogeneity_score(labels_true, labels_pred)
The metric is not symmetric: switching labels_true with labels_pred returns the completeness_score.
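The asymmetry can be checked with a short sketch such as the one below. The toy labels are an assumption for illustration; completeness_score is imported from the same sklearn.metrics module.

```python
from sklearn.metrics import completeness_score, homogeneity_score

# Hypothetical toy labels: every sample is placed in its own cluster
labels_true = [0, 0, 1, 1]
labels_pred = [0, 1, 2, 3]

h = homogeneity_score(labels_true, labels_pred)          # original argument order
h_swapped = homogeneity_score(labels_pred, labels_true)  # arguments switched
c = completeness_score(labels_true, labels_pred)         # completeness, original order

print("homogeneity:          ", h)
print("homogeneity (swapped):", h_swapped)
print("completeness:         ", c)
```

With these labels, the swapped homogeneity should match the completeness (up to rounding) and differ clearly from the homogeneity in the original order.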
Parameters :
- labels_true: <array-like of shape (n_samples,)>: The ground truth class labels to be used as a reference.
- labels_pred: <array-like of shape (n_samples,)>: The cluster labels to evaluate.
Returns:
homogeneity: <float>: A score between 0.0 and 1.0, where 1.0 stands for perfectly homogeneous labeling.
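To make the returned value more concrete, here is a minimal pure-Python sketch of the usual definition of homogeneity, h = 1 - H(C|K) / H(C), where H(C) is the entropy of the classes and H(C|K) is the entropy of the classes given the cluster assignments. The helper names (entropy, conditional_entropy, homogeneity_sketch) are hypothetical, and this is not scikit-learn's actual implementation, only an illustration of the formula.

```python
from collections import Counter
from math import log


def entropy(labels):
    """H(C): entropy of a labeling (natural logarithm)."""
    n = len(labels)
    return -sum((count / n) * log(count / n) for count in Counter(labels).values())


def conditional_entropy(labels_true, labels_pred):
    """H(C|K): entropy of the true classes given the cluster assignments."""
    n = len(labels_true)
    cluster_sizes = Counter(labels_pred)
    joint_counts = Counter(zip(labels_true, labels_pred))
    return -sum(
        (n_ck / n) * log(n_ck / cluster_sizes[k])
        for (_, k), n_ck in joint_counts.items()
    )


def homogeneity_sketch(labels_true, labels_pred):
    """Illustrative homogeneity: 1 - H(C|K) / H(C)."""
    h_c = entropy(labels_true)
    if h_c == 0.0:
        # Only one class present: every cluster is trivially homogeneous
        return 1.0
    return 1.0 - conditional_entropy(labels_true, labels_pred) / h_c


print(homogeneity_sketch([0, 1, 0, 1], [1, 0, 1, 0]))
print(homogeneity_sketch([0, 0, 1, 1], [0, 1, 0, 1]))
```

The score reaches 1.0 when knowing the cluster removes all uncertainty about the class, and drops to 0.0 when the clusters say nothing about the class.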
Example 1:
Python3
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import homogeneity_score

# Changing the file location
# cd C:\Users\Dev\Desktop\Credit Card Fraud

# Loading the data
df = pd.read_csv('creditcard.csv')

# Separating the dependent and independent variables
y = df['Class']
X = df.drop('Class', axis=1)

# Building the clustering model
kmeans = KMeans(n_clusters=2)

# Training the clustering model
kmeans.fit(X)

# Storing the predicted clustering labels
labels = kmeans.predict(X)

# Evaluating the performance
print(homogeneity_score(y, labels))
Output:
0.00496764949717645
Example 2: Perfectly homogeneous:
Python3
from sklearn.metrics.cluster import homogeneity_score

# Evaluate the score
hscore = homogeneity_score([0, 1, 0, 1], [1, 0, 1, 0])
print(hscore)
Output:
1.0
Example 3: Non-perfect labelings that further split classes into more clusters can be perfectly homogeneous:
Python3
from sklearn.metrics.cluster import homogeneity_score

# Evaluate the score
hscore = homogeneity_score([0, 0, 1, 1], [0, 1, 2, 3])
print(hscore)
Output:
0.9999999999999999
Example 4: Clusters that include samples from different classes do not make for a homogeneous labeling:
Python3
from sklearn.metrics.cluster import homogeneity_score

# Evaluate the score
hscore = homogeneity_score([0, 0, 1, 1], [0, 1, 0, 1])
print(hscore)
Output:
0.0
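To see why Example 3 scores (almost exactly) 1.0 while Example 4 scores 0.0, it can help to list which true classes end up in each cluster. The sketch below does this with the standard library only; the helper name classes_per_cluster is hypothetical and not part of scikit-learn.

```python
from collections import Counter, defaultdict


def classes_per_cluster(labels_true, labels_pred):
    """Map each cluster label to a Counter of the true classes it contains."""
    composition = defaultdict(Counter)
    for true_class, cluster in zip(labels_true, labels_pred):
        composition[cluster][true_class] += 1
    return dict(composition)


# Labelings taken from Example 3 and Example 4
split = classes_per_cluster([0, 0, 1, 1], [0, 1, 2, 3])
mixed = classes_per_cluster([0, 0, 1, 1], [0, 1, 0, 1])

for name, composition in (("Example 3", split), ("Example 4", mixed)):
    # A cluster is "pure" when it holds samples from exactly one class
    all_pure = all(len(counts) == 1 for counts in composition.values())
    readable = {cluster: dict(counts) for cluster, counts in composition.items()}
    print(name, readable, "all clusters pure:", all_pure)
```

In Example 3 every cluster holds a single class, even though the classes are split across several clusters. In Example 4 each cluster holds one sample of each class, so the clusters carry no information about the class.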