How to Perform Grubbs’ Test in Python

Last Updated : 23 Feb, 2022

Prerequisites: Parametric and Non-Parametric Methods, Hypothesis Testing 

This article covers several ways to perform Grubbs’ test in Python.

Grubbs’ test, also known as the maximum normalized residual test or the extreme studentized deviate test, detects a single outlier in a univariate data set that is assumed to come from a normally distributed population. The test is defined for the following hypotheses:

  • H0: There are no outliers in the data set
  • Ha: There is exactly one outlier in the data set
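
The choice between these hypotheses rests on the Grubbs statistic G: the largest absolute deviation from the sample mean, divided by the sample standard deviation. The following is a minimal sketch using only the standard library. The helper name grubbs_statistic is hypothetical and is not part of any package.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, suspect) for the two-sided Grubbs statistic.

    G = max |x_i - mean| / s, where s is the sample standard deviation.
    Hypothetical helper for illustration only.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # The suspect is the observation farthest from the mean.
    suspect = max(values, key=lambda x: abs(x - mean))
    return abs(suspect - mean) / sd, suspect


data = [20, 21, 26, 24, 29, 22, 21, 50, 28, 27]
g, suspect = grubbs_statistic(data)
print(f"G = {g:.3f} for suspect value {suspect}")
```

For this data the suspect value is 50 and G is roughly 2.65. H0 is rejected when G exceeds the critical value for the chosen significance level.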

Method 1: Performing a two-sided Grubbs’ Test

To run a two-sided Grubbs’ test, call the smirnov_grubbs.test() function from the outlier_utils package (imported as outliers) and pass it the data and the significance level.

Syntax: smirnov_grubbs.test(data, alpha)

Parameters:

  • data: A numeric vector of data values
  • alpha: The significance level to use for the test.

Example:

In this example, we perform a two-sided Grubbs’ test, which checks for an outlier at either end of the data set, using the smirnov_grubbs.test() function. The function returns the data with any detected outlier removed; here the value 50 is dropped.

Python




import numpy as np
from outliers import smirnov_grubbs as grubbs
 
# define data
data = np.array([20, 21, 26, 24, 29, 22,
                 21, 50, 28, 27])
 
# perform Grubbs' test
grubbs.test(data, alpha=.05)


Output:

array([20, 21, 26, 24, 29, 22, 21, 28, 27])
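
To see why 50 can be rejected at the 0.05 level, the sketch below compares G with the textbook two-sided critical value, which is built from a Student's t quantile with n - 2 degrees of freedom. This is not the package's own implementation. All function names are hypothetical, and the t quantile is approximated with Simpson's rule and bisection so that only the standard library is needed.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, integrating the density on [0, x] with Simpson's rule."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def grubbs_critical_two_sided(n, alpha):
    """Two-sided critical value: (n-1)/sqrt(n) * sqrt(t^2 / (n - 2 + t^2))."""
    t = t_quantile(1 - alpha / (2 * n), n - 2)
    return (n - 1) / math.sqrt(n) * math.sqrt(t * t / (n - 2 + t * t))


data = [20, 21, 26, 24, 29, 22, 21, 50, 28, 27]
mean, sd = statistics.mean(data), statistics.stdev(data)
g = max(abs(x - mean) for x in data) / sd
g_crit = grubbs_critical_two_sided(len(data), alpha=0.05)
print(f"G = {g:.3f}, critical value = {g_crit:.3f}, outlier: {g > g_crit}")
```

With n = 10 the critical value is close to 2.29, so a G of about 2.65 leads to rejecting H0.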

Method 2: Performing a one-sided Grubbs’ Test

For a one-sided Grubbs’ test, call grubbs.min_test() to check only the minimum value of the data set, or grubbs.max_test() to check only the maximum value. Each function returns the data with any detected outlier removed.

Syntax:

grubbs.min_test(data, alpha)

grubbs.max_test(data, alpha)

Example 1:

In this example, we perform a one-sided Grubbs’ test with the grubbs.min_test() function. The minimum value, 5, is not flagged as an outlier at the 0.05 level, so the data is returned unchanged.

Python




import numpy as np
from outliers import smirnov_grubbs as grubbs
 
# define data
data = np.array([20, 21, 26, 24, 29,
                 22, 21, 50, 28, 27, 5])
 
print("Data after performing min one-sided Grubbs' test: ")
 
# perform min Grubbs' test
grubbs.min_test(data, alpha=.05)


Output:

Data after performing min one-sided Grubbs' test: 
array([20, 21, 26, 24, 29, 22, 21, 50, 28, 27,  5])

Example 2:

In this example, we perform a one-sided Grubbs’ test with the grubbs.max_test() function. The maximum value, 50, is flagged as an outlier and removed.

Python




import numpy as np
from outliers import smirnov_grubbs as grubbs
 
# define data
data = np.array([20, 21, 26, 24, 29, 22,
                 21, 50, 28, 27, 5])
 
print("Data after performing max one-sided Grubbs' test: ")
 
# perform max Grubbs' test
grubbs.max_test(data, alpha=.05)


Output:

Data after performing max one-sided Grubbs' test:
array([20, 21, 26, 24, 29, 22, 21, 28, 27,  5])
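
The two one-sided results can be checked by hand. The sketch below computes the one-sided statistics with the standard library only. The critical value of about 2.23 (n = 11, alpha = 0.05, one-sided) is an assumption read from a standard Grubbs table and should be verified before being relied on.

```python
import statistics

data = [20, 21, 26, 24, 29, 22, 21, 50, 28, 27, 5]
mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation

g_min = (mean - min(data)) / sd  # tests the smallest value only
g_max = (max(data) - mean) / sd  # tests the largest value only

# Assumed critical value: approximately 2.23 for n = 11 at alpha = 0.05
# (one-sided), taken from a standard Grubbs table.
G_CRIT = 2.23

print(f"G_min = {g_min:.3f}, outlier: {g_min > G_CRIT}")
print(f"G_max = {g_max:.3f}, outlier: {g_max > G_CRIT}")
```

G_min is roughly 1.87 and falls below the critical value, so 5 is kept. G_max is roughly 2.38 and exceeds it, so 50 is removed.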

Method 3: Extract the Index of the Outlier using Grubbs’ test

To get the position of the outlier rather than the filtered data, use the following function.

grubbs.max_test_indices() function: This function returns the indices of the outliers found in the array.

Syntax: grubbs.max_test_indices(data, alpha)

Python




import numpy as np
from outliers import smirnov_grubbs as grubbs
 
# define data
data = np.array([20, 21, 26, 24, 29, 22,
                 21, 50, 28, 27, 5])
 
grubbs.max_test_indices(data, alpha=.05)


Output:

[7]

Method 4: Extract the Value of the Outlier using Grubbs’ test

To get the value of the outlier itself, use the following function.

grubbs.max_test_outliers() function: This function returns the values of the outliers found in the array.

Syntax: grubbs.max_test_outliers(data, alpha)

Python




import numpy as np
from outliers import smirnov_grubbs as grubbs
 
# define data
data = np.array([20, 21, 26, 24, 29, 22,
                 21, 50, 28, 27, 5])
 
grubbs.max_test_outliers(data, alpha=.05)


Output:

[50]
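
The index from Method 3 and the value from Method 4 describe the same observation. The plain-Python sketch below shows how one can be derived from the other and how the cleaned data follows from either. The variable outlier_indices is hard-coded here to the [7] reported in Method 3 and is not produced by a library call.

```python
data = [20, 21, 26, 24, 29, 22, 21, 50, 28, 27, 5]

# Assumed input: the index list reported by the index-based check in Method 3.
outlier_indices = [7]

# Look up the outlier values by position.
outlier_values = [data[i] for i in outlier_indices]

# Drop the flagged positions to obtain the cleaned data.
cleaned = [x for i, x in enumerate(data) if i not in outlier_indices]

print(outlier_values)
print(cleaned)
```

Working with indices is useful when the data has parallel columns, such as timestamps or labels, that need to be filtered in the same way.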

