
How to Perform a Shapiro-Wilk Test in Python

Last Updated : 30 Oct, 2022

In this article, we look at how to perform a Shapiro-Wilk test in Python.

The Shapiro-Wilk test (also called the Shapiro test) is a test of normality in frequentist statistics: it assesses whether a given sample is consistent with having been drawn from a normal distribution. The null hypothesis of the test is that the population is normally distributed.

Shapiro-Wilk test using shapiro() function

In this approach, we call the shapiro() function from the scipy.stats library, passing it the sample data, to conduct the Shapiro-Wilk test in Python.

Syntax: shapiro(x)

Parameters:

  • x: Array of sample data.

Returns:

  • statistic: The test statistic.
  • p-value: The p-value for the hypothesis test.
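As a minimal sketch of how these two return values can be used, the result of shapiro() can be unpacked like a tuple. The sample values and the names sample, stat and p_value below are made up purely for illustration.

```python
from scipy.stats import shapiro

# hypothetical measurements, invented for this illustration
sample = [4.8, 5.1, 5.3, 5.6, 5.9, 6.0, 6.2, 6.5, 6.9, 7.4]

# the result unpacks into the test statistic (W) and the p-value
stat, p_value = shapiro(sample)
print(f"W = {stat:.4f}, p-value = {p_value:.4f}")
```

Any array-like input with at least three values should work here, so a plain Python list is enough for a quick check.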

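This is a hypothesis test, and the two hypotheses are as follows:

  • H0 (null hypothesis): The sample comes from a normal distribution. We fail to reject H0 when the p-value is greater than 0.05.
  • Ha (alternative hypothesis): The sample does not come from a normal distribution. We reject H0 in favor of Ha when the p-value is at most 0.05.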
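The decision rule implied by these hypotheses can be sketched as a small helper. The function name shapiro_decision, its alpha parameter, and the skewed test data are assumptions made for illustration; they are not part of scipy.stats.

```python
import numpy as np
from scipy.stats import shapiro

def shapiro_decision(sample, alpha=0.05):
    """Return (statistic, p_value, reject_h0) for a Shapiro-Wilk test."""
    stat, p_value = shapiro(sample)
    # reject H0 when the p-value is at most the significance level
    reject_h0 = bool(p_value <= alpha)
    return stat, p_value, reject_h0

# illustrative, strongly right-skewed data (clearly not normal)
skewed = np.exp(np.linspace(0, 5, 200))
stat, p_value, reject_h0 = shapiro_decision(skewed)
print(f"W = {stat:.4f}, p-value = {p_value:.3g}, reject H0: {reject_h0}")
```

Keeping alpha as a parameter makes the threshold explicit rather than hard-coding 0.05 into the comparison.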

Example 1: Shapiro-Wilk test on a normally distributed sample in Python

In this example, we use the shapiro() function from the scipy.stats library to conduct a Shapiro-Wilk test on 500 randomly generated data points drawn from a standard normal distribution.

Python3

# import the required functions
from numpy.random import randn
from scipy.stats import shapiro

# create 500 samples from a standard normal distribution
gfg_data = randn(500)

# conduct the Shapiro-Wilk test
shapiro(gfg_data)


Output:

(0.9977102279663086, 0.7348126769065857)

Output Interpretation:

In the above example, the p-value is 0.73, which is greater than the significance level alpha (0.05), so we fail to reject the null hypothesis, i.e., we do not have sufficient evidence to say that the sample does not come from a normal distribution. Because the data is generated without a fixed seed, the exact values will differ from run to run.
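Example 1 draws its data without fixing a seed, so its output is not reproducible. One possible way to make such a run repeatable, sketched here under the assumption that NumPy's Generator API (numpy.random.default_rng) is available, is to seed the generator explicitly. The helper name shapiro_on_normal and the seed value 0 are illustrative choices.

```python
import numpy as np
from scipy.stats import shapiro

def shapiro_on_normal(seed, n=500):
    """Draw n standard-normal values from a seeded generator and test them."""
    rng = np.random.default_rng(seed)  # independent, seeded generator
    data = rng.standard_normal(n)
    stat, p_value = shapiro(data)
    return stat, p_value

# the same seed produces the same sample, hence the same test result
first = shapiro_on_normal(seed=0)
second = shapiro_on_normal(seed=0)
print(first == second)
```

The numbers produced this way will not match the output shown in Example 1, since that run used different random data.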

Example 2: Shapiro-Wilk test on a non-normally distributed sample in Python

In this example, we use the shapiro() function from the scipy.stats library to conduct a Shapiro-Wilk test on 200 randomly generated data points drawn from a Poisson distribution.

Python3

# import the required functions
from numpy.random import poisson
from numpy.random import seed
from scipy.stats import shapiro

# fix the seed so the sample is reproducible
seed(0)

# create 200 samples from a Poisson distribution with mean 5
gfg_data = poisson(5, 200)

# conduct the Shapiro-Wilk test
shapiro(gfg_data)


Output:

(0.966901957988739, 0.00011927181185455993)

Output Interpretation:

In the above example, the p-value is about 0.0001, which is less than alpha (0.05), so we reject the null hypothesis, i.e., we have sufficient evidence to say that the sample does not come from a normal distribution.


