How to Calculate Cramer’s V in Python?

Cramer’s V: It is a measure of the strength of association between two nominal variables. A nominal variable is one whose values are categories with no inherent order. Cramer’s V lies between 0 and 1 (inclusive). A value of 0 indicates that there is no association between the two variables. A value of 1 indicates a perfect association between the two variables. Cramer’s V can be calculated by using the formula below:

V = √( (X²/N) / min(C-1, R-1) )

Here,

X²: It is the Chi-square statistic

N: It represents the total sample size

R: It is equal to the number of rows in the contingency table

C: It is equal to the number of columns in the contingency table
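To make the formula concrete, here is a minimal sketch that computes every term by hand with NumPy only. The function name `cramers_v_by_hand` and the two small tables are illustrative assumptions, not part of the original examples; the chi-square statistic is computed without any continuity correction.

```python
import numpy as np

def cramers_v_by_hand(table):
    """Hypothetical helper: Cramer's V from a 2-D table of counts."""
    observed = np.asarray(table, dtype=float)
    n = observed.sum()                                   # N: total sample size
    row_totals = observed.sum(axis=1, keepdims=True)     # shape (R, 1)
    col_totals = observed.sum(axis=0, keepdims=True)     # shape (1, C)
    # Expected counts if the two variables were independent
    expected = row_totals @ col_totals / n
    # Chi-square statistic (X²), no continuity correction
    chi2 = ((observed - expected) ** 2 / expected).sum()
    r, c = observed.shape
    return np.sqrt((chi2 / n) / min(r - 1, c - 1))

# Perfect association: each row category maps to exactly one column category
print(cramers_v_by_hand([[10, 0], [0, 10]]))    # 1.0
# No association: every row has the same column proportions
print(cramers_v_by_hand([[10, 20], [20, 40]]))  # 0.0
```

The two calls illustrate the endpoints of the range: a table whose rows fully determine the columns gives 1, and a table whose rows are proportional to each other gives 0.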

Example 1:

Let us calculate Cramer’s V for a 3 × 3 Table.

Python3

# Load necessary packages and functions
import scipy.stats as stats
import numpy as np

# Make a 3 x 3 table
dataset = np.array([[13, 17, 11], [4, 6, 9],
                    [20, 31, 42]])

# Finding Chi-squared test statistic,
# sample size, and minimum of rows
# and columns
X2 = stats.chi2_contingency(dataset, correction=False)[0]
N = np.sum(dataset)
minimum_dimension = min(dataset.shape)-1

# Calculate Cramer's V
result = np.sqrt((X2/N) / minimum_dimension)

# Print the result
print(result)

Cramer’s V comes out to approximately 0.12, which indicates a weak association between the two variables in the table.
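As a cross-check on the manual calculation, SciPy also provides `scipy.stats.contingency.association`, which can compute Cramer’s V directly. The sketch below assumes a SciPy version that includes this function (1.7 or later, as far as I know) and compares its result with the manual formula on the same 3 × 3 table; the variable names `manual` and `builtin` are illustrative.

```python
import numpy as np
import scipy.stats as stats
from scipy.stats.contingency import association

# Same 3 x 3 table as in Example 1
dataset = np.array([[13, 17, 11], [4, 6, 9],
                    [20, 31, 42]])

# Manual calculation, as in Example 1
X2 = stats.chi2_contingency(dataset, correction=False)[0]
N = np.sum(dataset)
manual = np.sqrt((X2 / N) / (min(dataset.shape) - 1))

# Built-in calculation (assumes SciPy provides contingency.association)
builtin = association(dataset, method="cramer")

print(manual, builtin)
print(np.isclose(manual, builtin))
```

If the two values agree, the manual formula and the library routine are computing the same quantity.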

Example 2:

We will now calculate Cramer’s V for a larger table with unequal dimensions (5 × 4). Because the table has 5 rows and 4 columns, the denominator uses min(C-1, R-1) = 3. Cramer’s V comes out to approximately 0.14, which again indicates a weak association between the two variables in the table.

Python3

# Load necessary packages and functions
import scipy.stats as stats
import numpy as np

# Make a 5 x 4 table
dataset = np.array([[4, 13, 17, 11], [4, 6, 9, 12],
                    [2, 7, 4, 2], [5, 13, 10, 12],
                    [5, 6, 14, 12]])

# Finding Chi-squared test statistic,
# sample size, and minimum of rows and
# columns
X2 = stats.chi2_contingency(dataset, correction=False)[0]
N = np.sum(dataset)
minimum_dimension = min(dataset.shape)-1

# Calculate Cramer's V
result = np.sqrt((X2/N) / minimum_dimension)

# Print the result
print(result)
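Since both examples repeat the same three steps, it can be convenient to wrap them in a function. The following is a minimal sketch, not a definitive implementation: the name `cramers_v` and the input check are assumptions added for illustration, and the function relies on the same `stats.chi2_contingency` call used above.

```python
import numpy as np
import scipy.stats as stats

def cramers_v(table):
    """Hypothetical helper: Cramer's V for an R x C table of counts."""
    table = np.asarray(table)
    # The formula divides by min(R-1, C-1), so each dimension needs
    # at least two categories.
    if table.ndim != 2 or min(table.shape) < 2:
        raise ValueError("expected a 2-D table with at least 2 rows and 2 columns")
    chi2 = stats.chi2_contingency(table, correction=False)[0]
    n = table.sum()
    return float(np.sqrt((chi2 / n) / (min(table.shape) - 1)))

# The 5 x 4 table from Example 2
table_5x4 = [[4, 13, 17, 11], [4, 6, 9, 12],
             [2, 7, 4, 2], [5, 13, 10, 12],
             [5, 6, 14, 12]]

print(round(cramers_v(table_5x4), 2))
```

Using `min(table.shape) - 1` means the same function works for square and non-square tables without any change.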