How to Perform Dunn’s Test in Python


Dunn’s test is a non-parametric post hoc test that does not presume your data come from a particular distribution. It should be used when the Kruskal-Wallis test, the rank-based counterpart of one-way ANOVA, yields a statistically significant result for three or more groups. The Kruskal-Wallis test only tells you that at least one group differs from the others; Dunn’s multiple comparison test then compares the groups pairwise to establish which particular groups are different from one another.
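The order of the two tests can be sketched as follows. This is a minimal illustration that assumes SciPy is installed and uses scipy.stats.kruskal for the omnibus test; the three groups are made-up values, not the iris data used later in this article.

```python
from scipy.stats import kruskal

# Toy measurements for three groups (illustrative values only)
group_a = [1.1, 1.4, 1.2, 1.5, 1.3]
group_b = [2.6, 2.9, 2.7, 3.0, 2.8]
group_c = [4.1, 4.4, 4.2, 4.5, 4.3]

# Omnibus test: is there any difference among the groups at all?
statistic, p_value = kruskal(group_a, group_b, group_c)

alpha = 0.05
# Dunn's test is only worth running when the omnibus test is significant
run_dunn = p_value < alpha
print(f"H = {statistic:.3f}, p = {p_value:.5f}, run Dunn's test: {run_dunn}")
```

If run_dunn is False, there is no evidence of any difference between the groups, so pairwise comparisons are usually skipped.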

To perform Dunn’s test, the user needs to call the posthoc_dunn() function from the scikit-posthocs library.

posthoc_dunn() Function:

Syntax:

scikit_posthocs.posthoc_dunn(a, val_col: str = None, group_col: str = None, p_adjust: str = None, sort: bool = True)

Parameters:

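  • a : an array-like object, a pandas DataFrame, or a Series containing the data.
  • val_col : name of the DataFrame column that holds the measured values (the dependent variable). Used when a is a DataFrame.
  • group_col : name of the DataFrame column that holds the group labels (the predictor variable). Used when a is a DataFrame.
  • p_adjust : method used to adjust the p-values for multiple comparisons. It is a string; possible values include:
    • 'bonferroni'
    • 'hommel'
    • 'holm-sidak'
    • 'holm'
    • 'simes-hochberg' and more…

Returns: a pandas DataFrame containing the p-value for each pair of groups.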

Command to install the scikit-posthocs library:

pip install scikit-posthocs

Each pairwise comparison is a hypothesis test, and the two hypotheses are as follows:

  • Null hypothesis: The two samples being compared have the same median.
  • Alternative hypothesis: The two samples being compared have different medians.
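To make the pairwise comparison concrete, here is a minimal pure-Python sketch of the usual textbook form of Dunn’s statistic: all observations are ranked together, the mean ranks of two groups are compared, and the difference is scaled by a standard error that includes a correction for ties. The function name dunn_unadjusted is hypothetical, the p-values are not adjusted for multiple comparisons, and this is not presented as the library’s exact implementation.

```python
import math
from itertools import combinations


def dunn_unadjusted(groups):
    """Return {(i, j): (z, p)} for every pair of groups (0-based indices)."""
    # Pool all observations, remembering which group each one came from
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    n_total = len(pooled)

    # Give tied values their average rank and accumulate sum(t^3 - t) over ties
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    start = 0
    while start < n_total:
        end = start
        while end + 1 < n_total and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        t = end - start + 1
        avg_rank = (start + end + 2) / 2  # ranks are 1-based
        for k in range(start, end + 1):
            rank_sums[pooled[k][1]] += avg_rank
        tie_term += t ** 3 - t
        start = end + 1

    sizes = [len(values) for values in groups]
    mean_ranks = [rs / n for rs, n in zip(rank_sums, sizes)]
    variance_base = n_total * (n_total + 1) / 12 - tie_term / (12 * (n_total - 1))

    results = {}
    for i, j in combinations(range(len(groups)), 2):
        se = math.sqrt(variance_base * (1 / sizes[i] + 1 / sizes[j]))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        results[(i, j)] = (z, p)
    return results


# Toy data (illustrative only): three clearly separated groups
toy = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
for pair, (z, p) in dunn_unadjusted(toy).items():
    print(pair, round(z, 4), round(p, 6))
```

A small p-value for a pair is evidence against the null hypothesis for that pair. In practice the raw p-values should still be adjusted, which is what the p_adjust parameter of posthoc_dunn() is for.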

In the following example, we import the packages, read the iris CSV file, and use the posthoc_dunn() function to perform Dunn’s test. The test is performed on the sepal width of the three plant species.

Python3




# importing packages and modules
import pandas as pd
import scikit_posthocs as sp

# reading CSV file
dataset = pd.read_csv('iris.csv')

# sepal width of each of the three species
data = [dataset[dataset['species'] == "setosa"]['sepal_width'],
        dataset[dataset['species'] == "versicolor"]['sepal_width'],
        dataset[dataset['species'] == "virginica"]['sepal_width']]

# using the posthoc_dunn() function with Holm-adjusted p-values
p_values = sp.posthoc_dunn(data, p_adjust='holm')

print(p_values)
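The call above requests the 'holm' adjustment. As a rough sketch of what a Holm step-down correction does to a set of raw p-values, the pure-Python function below (holm_adjust is a hypothetical name, and the input values are invented) multiplies the smallest p-value by the number of comparisons, the next smallest by one fewer, and so on, while keeping the adjusted values non-decreasing and capped at 1.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up raw p-values for three pairwise comparisons
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

For these inputs the adjusted values come out at roughly 0.03, 0.06 and 0.06, so only the first comparison stays below 0.05 after the correction.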


Output:

  • For the difference between groups 1 and 2, the adjusted p-value is 3.247311e-14
  • For the difference between groups 2 and 3, the adjusted p-value is 1.521219e-02

We further check whether the p-values are higher than the level of significance. False indicates that the difference between the two groups is statistically significant, that is, the null hypothesis is rejected for that pair.

Python3




p_values > 0.05


 
 

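Output: a table of Boolean values with the same layout as p_values, in which each entry is True if the corresponding p-value is greater than 0.05 and False otherwise.
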
We take the level of significance to be 0.05 in this example. Apart from the diagonal entries, which compare a group with itself, no two groups (species) have a p-value greater than 0.05, so every pair of species differs significantly in sepal width. Hence, for each pair we reject the null hypothesis in favor of the alternative hypothesis.
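When there are more groups, reading the Boolean table by eye becomes tedious. The sketch below shows one possible way to list the significant pairs from a square p-value table. The helper significant_pairs is hypothetical, and the matrix is filled with placeholder numbers rather than the iris results.

```python
from itertools import combinations

import pandas as pd


def significant_pairs(p_matrix, alpha=0.05):
    """List (group_a, group_b, p) for every pair with p below alpha."""
    pairs = []
    # combinations() visits each unordered pair once and skips the diagonal
    for a, b in combinations(p_matrix.index, 2):
        p = float(p_matrix.loc[a, b])
        if p < alpha:
            pairs.append((a, b, p))
    return pairs


# Placeholder symmetric matrix of adjusted p-values (values are made up)
labels = ["g1", "g2", "g3"]
toy = pd.DataFrame(
    [[1.0, 0.001, 0.20],
     [0.001, 1.0, 0.03],
     [0.20, 0.03, 1.0]],
    index=labels, columns=labels,
)

print(significant_pairs(toy))
```

In this placeholder matrix only the pairs (g1, g2) and (g2, g3) fall below 0.05; the same helper could be applied to the p_values table returned by posthoc_dunn().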

 



Last Updated : 19 Apr, 2022