Welch’s t-Test in Python

Last Updated : 21 Feb, 2022

Welch’s t-Test: The two-sample t-test is used to compare the means of two independent datasets. However, the standard (Student’s) two-sample t-test assumes that both data groups share the same variance. To compare two data groups that have different variances, we use Welch’s t-test. It is a parametric variant of the two-sample t-test that does not assume equal variances.

The user needs to install and import the following libraries to perform Welch’s t-Test in Python:

  • scipy
  • numpy

Syntax to install all the above packages:

pip3 install scipy numpy

Conducting Welch’s t-Test is a step-by-step process. The steps are described below.

Step 1: Import the libraries.

The first step is to import the libraries installed above.

Python3




# Importing libraries
import scipy.stats as stats
import numpy as np


Step 2: Creating data groups. 

Let us consider an example. We are given two samples, each containing the heights of 15 students from a class. We need to check whether the students of the two classes have the same mean height. We can create the data groups using the numpy.array() method.

Python3




# Creating data groups
data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,
                        17, 16, 14, 19, 20, 21, 15,
                        15])
data_group2 = np.array([36, 37, 44, 27, 24, 28, 27,
                        39, 29, 24, 37, 32, 24, 26,
                        33])


Step 3: Check the variance.

Before actually conducting Welch’s t-Test, we need to check whether the given data groups have the same variance. As a rule of thumb, if the ratio of the larger variance to the smaller variance is greater than 4:1, we can consider the data groups to have unequal variances. To find the variance of a data group, we can use the syntax below. Note that np.var() computes the population variance by default (ddof=0); pass ddof=1 to get the sample variance.

Syntax: 

print(np.var(data_group))

Here,

data_group: The given data group

Python3




# Python program to display variance 
# of data groups
  
# Import library
import scipy.stats as stats
import numpy as np
  
# Creating data groups
data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,
                        17, 16, 14, 19, 20, 21, 15,
                        15])
data_group2 = np.array([36, 37, 44, 27, 24, 28, 27,
                        39, 29, 24, 37, 32, 24, 26,
                        33])
  
# Print the variance of both data groups
print(np.var(data_group1), np.var(data_group2))


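Output (rounded to two decimal places):

9.05 36.78

Here, the ratio of the larger variance to the smaller one is about 36.78 / 9.05 ≈ 4.06, which is greater than 4:1, so we treat the variances as unequal and apply Welch’s t-test.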
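The 4:1 rule of thumb can also be checked in code. The following is a minimal sketch, assuming the sample variance (ddof=1) is the quantity of interest; the helper name variance_ratio is hypothetical and not part of NumPy or SciPy.

```python
import numpy as np


def variance_ratio(a, b, ddof=1):
    """Return larger variance / smaller variance (hypothetical helper)."""
    var_a = np.var(a, ddof=ddof)
    var_b = np.var(b, ddof=ddof)
    return max(var_a, var_b) / min(var_a, var_b)


# Same data groups as in the article
data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,
                        17, 16, 14, 19, 20, 21, 15,
                        15])
data_group2 = np.array([36, 37, 44, 27, 24, 28, 27,
                        39, 29, 24, 37, 32, 24, 26,
                        33])

ratio = variance_ratio(data_group1, data_group2)

# Rule of thumb: a ratio above 4 suggests unequal variances
unequal_variances = ratio > 4
print(ratio, unequal_variances)
```

Because both groups have the same size here, the ratio is the same whether ddof=0 or ddof=1 is used; with unequal group sizes the two choices can give slightly different ratios.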

Step 4: Conducting Welch’s t-Test.

Syntax:

ttest_ind(data_group1, data_group2, equal_var= False)

Here,

data_group1: First data group

data_group2: Second data group

equal_var = False: Welch’s t-test is conducted, which does not assume equal population variances. Pass the boolean value False, not the string “False”.
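To make explicit what equal_var=False changes, the sketch below computes Welch’s statistic by hand using the standard formulas: the difference in means divided by the square root of the summed per-group squared standard errors, with degrees of freedom from the Welch-Satterthwaite approximation. The function name welch_t is hypothetical; it is an illustration for comparison with ttest_ind, not SciPy’s internal implementation.

```python
import numpy as np
from scipy import stats


def welch_t(a, b):
    """Two-sided Welch's t-test computed by hand (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Squared standard error of each group mean (sample variance / n)
    se2_a = a.var(ddof=1) / a.size
    se2_b = b.var(ddof=1) / b.size

    # Welch's t statistic
    t_stat = (a.mean() - b.mean()) / np.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (a.size - 1) + se2_b ** 2 / (b.size - 1)
    )

    # Two-sided p-value from the t distribution
    p_value = 2 * stats.t.sf(abs(t_stat), df)
    return t_stat, df, p_value


data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,
                        17, 16, 14, 19, 20, 21, 15,
                        15])
data_group2 = np.array([36, 37, 44, 27, 24, 28, 27,
                        39, 29, 24, 37, 32, 24, 26,
                        33])

t_stat, df, p_value = welch_t(data_group1, data_group2)
scipy_result = stats.ttest_ind(data_group1, data_group2, equal_var=False)

# The manual values should agree with SciPy's result
print(t_stat, df, p_value)
print(scipy_result.statistic, scipy_result.pvalue)
```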

Example:

Python3




# Python program to conduct Welch's t-Test
  
# Import library
import scipy.stats as stats
import numpy as np
  
# Creating data groups
data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,
                        17, 16, 14, 19, 20, 21, 15,
                        15])
data_group2 = np.array([36, 37, 44, 27, 24, 28, 27,
                        39, 29, 24, 37, 32, 24, 26,
                        33])
  
# Conduct Welch's t-Test and print the result
print(stats.ttest_ind(data_group1, data_group2, equal_var = False))


Output:

[Output image: the result returned by ttest_ind, containing the test statistic and the p-value]

Interpretation of the Output:

The test statistic is approximately -8.658 and the corresponding p-value is 2.757e-08. Because the p-value is less than 0.05, we reject the null hypothesis that the two classes have the same mean height and conclude that the difference between the mean heights of the students in the two classes is statistically significant.
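The decision rule described above can be written out in code. This is a minimal sketch, assuming a significance level of 0.05; the names alpha and reject_null are illustrative choices, not part of SciPy.

```python
import numpy as np
from scipy import stats

data_group1 = np.array([14, 15, 15, 16, 13, 8, 14,
                        17, 16, 14, 19, 20, 21, 15,
                        15])
data_group2 = np.array([36, 37, 44, 27, 24, 28, 27,
                        39, 29, 24, 37, 32, 24, 26,
                        33])

# Assumed significance level; choose it before looking at the data
alpha = 0.05

result = stats.ttest_ind(data_group1, data_group2, equal_var=False)

# Reject the null hypothesis of equal means when the p-value is below alpha
reject_null = result.pvalue < alpha

if reject_null:
    print("Reject the null hypothesis: the mean heights differ.")
else:
    print("Fail to reject the null hypothesis.")
```

The negative sign of the statistic only reflects the order of the arguments: the first group has the smaller mean.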


