Open In App

How to Perform a Kruskal-Wallis Test in Python

Last Updated : 14 Dec, 2022
Like Article

Kruskal-Wallis test is a non-parametric test and an alternative to One-Way Anova. By non-parametric we mean, the data is not assumed to become from a particular distribution. The main objective of this test is used to determine whether there is a statistical difference between the medians of at least three independent groups. 


The Kruskal-Wallis Test has the null and alternative hypotheses as discussed below:

  • The null hypothesis (H0): The median is the same for all the data groups.
  • The alternative hypothesis: (Ha): The median is not equal for all the data groups.

Stepwise Implementation:

Let us consider an example in which the Research and Development team wants to determine if applying three different engine oils leads to the difference in the mileage of cars. The team decided to opt for 15 cars of the same brand and break down them into groups of three (5 cars in each group). Now each group is doped with exactly one engine oil (all three engine oils are used). Then they are allowed to run for 20 kilometers on the same track and once their journey gets ended, the mileage was noted down.

Step 1: Create the data

The very first step is to create data. We need to create three arrays that can hold cars’ mileage (one for each group).


data_group1 = [7, 9, 12, 15, 21]
data_group2 = [5, 8, 14, 13, 25]
data_group3 = [6, 8, 8, 9, 5]

Step 2: Perform the Kruskal-Wallis Test

Python provides us kruskal() function from the scipy.stats library using which we can conduct the Kruskal-Wallis test in Python easily.


# Import libraries
from scipy import stats
# Defining data groups
data_group1 = [7, 9, 12, 15, 21]
data_group2 = [5, 8, 14, 13, 25]
data_group3 = [6, 8, 8, 9, 5]
# Conduct the Kruskal-Wallis Test
result = stats.kruskal(data_group1, data_group2, data_group3)
# Print the result



Step 3: Analyze the results.

In this example, the test statistic comes out to be equal to 3.492 and the corresponding p-value is 0.174. As the p-value is not less than 0.05, we cannot reject the null hypothesis that the median mileage of cars is the same for all three groups. Hence, We don’t have sufficient proof to claim that the different types of engine oils used to lead to statistically significant differences in the mileage of cars.

Similar Reads

How to Perform Bartlett’s Test in Python?
Bartlett's test is used to check whether all samples have the same variance. it's also called Bartlett's test for homogeneity. Before executing some statistical tests like the One-Way ANOVA, Barlett's test ensures that the hypothesis of equal variances is correct. It's utilized when your data is almost certainly from a Gaussian distribution or norm
2 min read
How to Perform an Anderson-Darling Test in Python
Anderson-Darling test: Its full name is Anderson-Darling Goodness of Fit Test (AD-Test) and it is used to measure the extent to which our data fits with the specified distribution. Mostly this test is used to find whether the data follows a normal distribution. Syntax to install scipy and numpy library: pip3 install scipy numpy Scipy is a python li
3 min read
How to Perform Fisher’s Exact Test in Python
Fisher's exact test is a statistical test that determines if two category variables have non-random connections or we can say it's used to check whether two category variables have a significant relationship. In this article let's learn how to perform Fisher's exact test. In python fisher exact test can be performed using the fisher_exact() functio
2 min read
How to Perform a Breusch-Pagan Test in Python
Heteroskedasticity is a statistical term and it is defined as the unequal scattering of residuals. More specifically it refers to a range of measured values the change in the spread of residuals. Heteroscedasticity possesses a challenge because ordinary least squares (OLS) regression considers the residuals thrown out from a population having homos
4 min read
How to Perform McNemar’s Test in Python
McNemar's Test: This is a non-parametric test for the paired nominal data. This test is used when we want to find the change in proportion for the paired data. This test is also known as McNemar’s Chi-Square test. This is because the test statistic has a chi-square distribution. Assumptions for the McNemar Test: Below are the main assumptions for t
3 min read
How to Perform the Nemenyi Test in Python
Nemenyi Test: The Friedman Test is used to find whether there exists a significant difference between the means of more than two groups. In such groups, the same subjects show up in each group. If the p-value of the Friedman test turns out to be statistically significant then we can conduct the Nemenyi test to find exactly which groups are differen
3 min read
How to Perform an F-Test in Python
In statistics, Many tests are used to compare the different samples or groups and draw conclusions about populations. These techniques are commonly known as Statistical Tests or hypothesis Tests. It focuses on analyzing the likelihood or probability of obtaining the observed data that they are random or follow specific assumptions or hypotheses. Th
10 min read
Python | Perform operation on each key dictionary
Sometimes, while working with dictionaries, we might come across a problem in which we require to perform a particular operation on each value of keys. This type of problem can occur in web development domain. Let's discuss certain ways in which this task can be performed. Method #1 : Using loop This is the naive method in which this task can be pe
4 min read
How to Perform Multivariate Normality Tests in Python
In this article, we will be looking at the various approaches to perform Multivariate Normality Tests in Python. Multivariate Normality test is a test of normality, it determines whether the given group of variables comes from the normal distribution or not. Multivariate Normality Test determines whether or not a group of variables follows a multiv
3 min read
perform method - Action Chains in Selenium Python
Selenium’s Python Module is built to perform automated testing with Python. ActionChains are a way to automate low-level interactions such as mouse movements, mouse button actions, keypress, and context menu interactions. This is useful for doing more complex actions like hover over and drag and drop. Action chain methods are used by advanced scrip
2 min read