How to Perform Dunn’s Test in Python
When a Kruskal-Wallis test yields a statistically significant result, Dunn’s test should be used to establish which groups differ. The Kruskal-Wallis test (the non-parametric counterpart of one-way ANOVA) only reveals that a difference exists among three or more groups; Dunn’s test then determines which particular groups differ from the rest. Dunn’s multiple comparison test is a non-parametric post hoc test that does not assume your data come from a particular distribution.
To perform Dunn’s test, call the posthoc_dunn() function from the scikit-posthocs library.
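As a minimal sketch of the step that comes before Dunn’s test, the Kruskal-Wallis test can be run with scipy.stats.kruskal(). The three groups below are made-up toy values, and the 0.05 threshold is an assumed significance level, not something fixed by the method.

```python
from scipy import stats

# Toy measurements for three groups (made-up values, for illustration only).
group_a = [1, 2, 3, 4, 5]
group_b = [6, 7, 8, 9, 10]
group_c = [11, 12, 13, 14, 15]

# Kruskal-Wallis H-test: a rank-based omnibus test across all groups.
h_stat, p_value = stats.kruskal(group_a, group_b, group_c)

alpha = 0.05  # assumed significance level
run_dunn = p_value < alpha  # proceed to Dunn's test only if this is True
print(f"H = {h_stat:.3f}, p = {p_value:.5f}, run Dunn's test: {run_dunn}")
```

If run_dunn is False, the omnibus test found no evidence of a difference, and a post hoc comparison is usually not warranted.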
posthoc_dunn() Function:
Syntax:
scikit_posthocs.posthoc_dunn(a, val_col: str = None, group_col: str = None, p_adjust: str = None, sort: bool = True)
Parameters:
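- a : an array-like object, a DataFrame, or a Series.
- val_col : name of the DataFrame column that contains the dependent (response) variable values.
- group_col : name of the DataFrame column that contains the grouping (predictor) variable, i.e., the independent variable.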
- p_adjust : method used to adjust the p-values for multiple comparisons. It is a string; possible values include:
- 'bonferroni'
- 'hommel'
- 'holm-sidak'
- 'holm'
- 'simes-hochberg' and more…
Returns: a DataFrame of p-values, one for each pair of groups.
Syntax to install the scikit-posthocs library:
pip install scikit-posthocs
This is a hypothesis test, and the two hypotheses are as follows:
- Null hypothesis: The given samples have the same median.
- Alternative hypothesis: The given samples have different medians.
In this example, we import the packages, read the iris CSV file, and use the posthoc_dunn() function to perform Dunn’s test. Dunn’s test is performed on the sepal width of the three plant species.
Python3
import pandas as pd
import scikit_posthocs as sp

# Load the iris dataset (expects iris.csv in the working directory)
dataset = pd.read_csv('iris.csv')

# Collect the sepal width values for each of the three species
data = [dataset[dataset['species'] == "setosa"]['sepal_width'],
        dataset[dataset['species'] == "versicolor"]['sepal_width'],
        dataset[dataset['species'] == "virginica"]['sepal_width']]

# Run Dunn's test with Holm-adjusted p-values
p_values = sp.posthoc_dunn(data, p_adjust='holm')
print(p_values)
Output:
- For the difference between groups 1 and 2, the adjusted p-value is 3.247311e-14
- For the difference between groups 2 and 3, the adjusted p-value is 1.521219e-02
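To show what "adjusted" means here, the following is a plain-Python sketch of the Holm step-down procedure, which is the correction requested by p_adjust = 'holm'. The helper name holm_adjust and the raw p-values are hypothetical and are not taken from the iris results; the library's own implementation may differ in detail.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the raw p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three pairwise comparisons (not the iris data).
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print(adjusted)
```

Each adjusted value is at least as large as its raw counterpart, which is how the correction guards against false positives when several pairs are tested at once.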
We further check whether the p-values are higher than the level of significance. False indicates that the difference between two groups is statistically significant, i.e., that the null hypothesis is rejected for that pair.
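One way to carry out this check is an element-wise comparison on the p-value DataFrame. The sketch below uses a hypothetical matrix shaped like the posthoc_dunn() output; the numbers are made up for illustration and are not the iris results, and alpha = 0.05 is an assumed level of significance.

```python
import pandas as pd

# Hypothetical adjusted p-value matrix shaped like posthoc_dunn() output.
# The numbers are made up for illustration, not the iris results.
p_values = pd.DataFrame(
    [[1.0, 0.001, 0.20],
     [0.001, 1.0, 0.03],
     [0.20, 0.03, 1.0]],
    index=["a", "b", "c"],
    columns=["a", "b", "c"],
)

alpha = 0.05  # assumed level of significance

# True  -> p-value above alpha: no significant difference detected.
# False -> p-value at or below alpha: the pair differs significantly.
not_significant = p_values > alpha
print(not_significant)
```

In this toy matrix the diagonal entries are set to 1.0, so comparing a group with itself always shows True.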
Output:
We take the level of significance to be 0.05 in this example. Every pair of groups (species) differs significantly, since no pair has a p-value greater than 0.05. Hence, we reject the null hypothesis in favor of the alternative hypothesis.
Last Updated: 19 Apr, 2022