Chi-square distance in Python

  • Last Updated : 19 Jul, 2021

The Chi-square distance is a statistical measure of how dissimilar two feature arrays (for example, histograms) are: the smaller the distance, the more similar the arrays. It is commonly used in applications such as similar-image retrieval, image texture analysis, and feature extraction. The Chi-square distance of two arrays ‘x’ and ‘y’ with ‘n’ dimensions is calculated using the formula below:

$$\chi^2(x, y) = \frac{1}{2} \sum_{i=1}^{n} \frac{(x_i - y_i)^2}{x_i + y_i}$$
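As a quick hand check, assume the distance is half the sum of (x_i - y_i)^2 / (x_i + y_i), which is what the code in Method #1 computes. The vectors x = (1, 2, 3) and y = (3, 2, 1) are illustrative values chosen here, not taken from the article's examples:

```latex
\chi^2(x, y)
  = \frac{1}{2}\left[\frac{(1-3)^2}{1+3} + \frac{(2-2)^2}{2+2} + \frac{(3-1)^2}{3+1}\right]
  = \frac{1}{2}\,(1 + 0 + 1)
  = 1
```

Identical entries contribute nothing, so two identical arrays have a distance of 0.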
 

In this article, we will learn how to calculate the Chi-square distance using Python. Two methods are given below, each with an example. 
Method #1: Calculating the Chi-square distance manually using the above formula. 
 



Python3




# importing numpy library
import numpy as np
 
# Function to calculate the Chi-square distance
def chi2_distance(A, B):
 
    # compute the chi-squared distance using the above formula
    # (assumes a + b != 0 for every pair, otherwise the division fails)
    chi = 0.5 * np.sum([((a - b) ** 2) / (a + b)
                      for (a, b) in zip(A, B)])
 
    return chi
 
# example usage
if __name__ == "__main__":
    a = [1, 2, 13, 5, 45, 23]
    b = [67, 90, 18, 79, 24, 98]
 
    result = chi2_distance(a, b)
    print("The Chi-square distance is :", result)
Input : a = [1, 2, 13, 5, 45, 23]
        b = [67, 90, 18, 79, 24, 98] 
Output : The Chi-square distance is : 133.55428601494035

Input : a = [91, 900, 78, 30, 602, 813]
        b = [57, 49, 36, 759, 234, 928]
Output :  The Chi-square distance is : 814.776999405035
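The loop-based function above can also be written with NumPy array operations. The following is a minimal sketch, not part of the original method: the name chi2_distance_vectorized and the choice to skip positions where both values are zero are assumptions made here, and the inputs are assumed to be non-negative (as histogram counts are).

```python
import numpy as np


def chi2_distance_vectorized(x, y):
    """Chi-square distance 0.5 * sum((x - y)^2 / (x + y)).

    Hypothetical helper: assumes non-negative inputs of equal shape.
    Positions where x + y == 0 are skipped; for non-negative data this
    means both values are 0, so the term would contribute nothing anyway.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")

    denom = x + y
    mask = denom != 0  # avoid division by zero
    return 0.5 * np.sum((x[mask] - y[mask]) ** 2 / denom[mask])


a = [1, 2, 13, 5, 45, 23]
b = [67, 90, 18, 79, 24, 98]
print("The Chi-square distance is :", chi2_distance_vectorized(a, b))
```

For the first input pair this should agree with the loop-based result up to floating-point rounding. Skipping empty positions is one possible convention; adding a small constant to the denominator is another common one.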

  
 
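Method #2: Using scipy.stats.chisquare() method
Note that scipy.stats.chisquare() performs Pearson's chi-square goodness-of-fit test: its statistic is sum((f_obs - f_exp)^2 / f_exp), which is related to, but not the same quantity as, the symmetric distance computed in Method #1.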

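Syntax: scipy.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0) 
Parameters: 
==> f_obs : array_like, observed frequencies in each category 
==> f_exp : array_like, optional, expected frequencies in each category; if omitted, the categories are assumed to be equally likely 
==> ddof (delta degrees of freedom, an adjustment to the degrees of freedom used for the p-value) : int, optional 
==> axis : int or None, optional, the axis along which the test is applied 
The default value of both ddof and axis is 0.
Returns: 
==> chisq : float or ndarray, the chi-squared test statistic 
==> p-value of the test : float or ndarray 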
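To illustrate the f_exp parameter, here is a small sketch that passes explicit expected frequencies. The counts are illustrative values chosen here, not data from this article; both lists sum to 88 because the observed and expected totals are expected to agree.

```python
from scipy.stats import chisquare

# Illustrative counts (an assumption for this sketch); both lists sum to 88.
observed = [16, 18, 16, 14, 12, 12]
expected = [16, 16, 16, 16, 16, 8]

result = chisquare(observed, f_exp=expected)

# The result exposes the test statistic and the p-value as attributes.
print(result.statistic)  # sum((observed - expected)^2 / expected)
print(result.pvalue)
```

Working the statistic by hand gives 0 + 0.25 + 0 + 0.25 + 1 + 2 = 3.5, which is what the statistic attribute should report.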
 

 

Python3




# importing scipy
from scipy.stats import chisquare
 
k = [3, 4, 6, 2, 9, 5, 2]
print(chisquare(k))

Output : 
 

Power_divergenceResult(statistic=8.516129032258064, pvalue=0.20267440425509237)
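The statistic in this output can be reproduced by hand. When f_exp is omitted, every category is given the same expected frequency, namely the mean of the observed values. The sketch below uses plain Python; pearson_chi2_statistic is a hypothetical helper written for illustration and is not SciPy's implementation.

```python
def pearson_chi2_statistic(f_obs, f_exp=None):
    """Pearson statistic sum((obs - exp)^2 / exp).

    Hypothetical helper for illustration only. If f_exp is None, all
    categories are assumed equally likely, so each expected frequency
    is the mean of the observed frequencies.
    """
    if f_exp is None:
        mean = sum(f_obs) / len(f_obs)
        f_exp = [mean] * len(f_obs)
    return sum((o - e) ** 2 / e for o, e in zip(f_obs, f_exp))


k = [3, 4, 6, 2, 9, 5, 2]
stat = pearson_chi2_statistic(k)
print(round(stat, 6))  # 8.516129
```

Here the expected frequency is 31 / 7 for each of the seven categories, and the statistic works out to 264 / 31, matching the value reported by chisquare(k).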

 



