Homogeneity of Variance Test in R Programming

In statistics, a sequence of random variables is homoscedastic if all of its random variables have the same finite variance. This property is also known as homogeneity of variance. This article explains how to test for homogeneity of variance across two or more groups in R. Some statistical tests, such as the two independent samples t-test and ANOVA, assume that variances are equal across groups, so it is worth checking that assumption before relying on them. Several tests can be used to evaluate the equality of variances:

  • F-test: Compares the variances of two groups. It assumes that the data are normally distributed.
  • Bartlett’s test: Compares the variances of two or more groups. It also assumes that the data are normally distributed.
  • Levene’s test: A robust alternative to Bartlett’s test that is less sensitive to deviations from normality.
  • Fligner-Killeen test: A non-parametric test that is very robust against departures from normality.

Preparing the Data Set

Before explaining each test, let’s prepare and understand the data set. “ToothGrowth” is one of the standard learning data sets included in R. It records tooth length in 60 guinea pigs, each of which received one of three vitamin C dose levels (0.5, 1, or 2 mg/day) by one of two delivery methods (orange juice or ascorbic acid), giving 10 animals per combination. The data set contains 60 observations of 3 variables:

  • len: Tooth length
  • supp: Supplement type (VC or OJ)
  • dose: Dose in milligrams per day

R




# Exploring the ToothGrowth data set
print(head(ToothGrowth, 10))
# str() prints the structure itself and returns NULL, so it is not wrapped in print()
str(ToothGrowth)


Output:

    len  supp dose
1   4.2   VC  0.5
2  11.5   VC  0.5
3   7.3   VC  0.5
4   5.8   VC  0.5
5   6.4   VC  0.5
6  10.0   VC  0.5
7  11.2   VC  0.5
8  11.2   VC  0.5
9   5.2   VC  0.5
10  7.0   VC  0.5
'data.frame':    60 obs. of  3 variables:
 $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
 $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
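
Since every test in this article compares group variances, it can help to look at those variances directly first. The following is a minimal sketch using only base R; the variable names group_n and group_var are illustrative and not part of the data set.

```r
# Number of observations in each supplement group
group_n <- table(ToothGrowth$supp)
print(group_n)

# Sample variance of tooth length within each supplement group
group_var <- tapply(ToothGrowth$len, ToothGrowth$supp, var)
print(group_var)
```

The tests below ask whether a difference between these two sample variances is larger than would be expected by chance.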

F-test

The F-test evaluates whether the variances of two populations are equal. It assumes that the data in both groups are normally distributed.
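
Because the F-test relies on normality, it may be worth checking that assumption before running it. The sketch below uses the Shapiro-Wilk test from base R for this; that choice is an assumption of this example rather than something the F-test itself prescribes, and the name normality_p is illustrative.

```r
# Shapiro-Wilk p-value for tooth length within each supplement group
# (a small p-value suggests a departure from normality)
normality_p <- tapply(ToothGrowth$len, ToothGrowth$supp,
                      function(x) shapiro.test(x)$p.value)
print(normality_p)
```

If either group shows a clear departure from normality, one of the more robust tests described later in this article may be a safer choice.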

Statistical Hypothesis:

A hypothesis is a statement about a population parameter. Hypothesis testing is a statistical method for making decisions from experimental data: it evaluates two mutually exclusive statements about a population, the null hypothesis and the alternative hypothesis, to determine which is better supported by the sample data. For more background, refer to Understanding Hypothesis Testing. For the F-test, the statistical hypotheses are:

  • Null Hypothesis: The variances of the two groups are equal
  • Alternative Hypothesis: The variances are different

Implementation in R:

The var.test() function from the stats package performs an F-test of the null hypothesis that two normal populations have equal variances.

Syntax:

var.test(formula, data)

Parameters:

formula: a formula of the form values ~ groups, where groups has exactly two levels

data: a matrix or data frame containing the variables in the formula

Example:

R




# R program to illustrate
# F-test
  
# Using var.test()
result <- var.test(len ~ supp, data = ToothGrowth)
  
# print the result
print(result)


Output:

F test to compare two variances

data:  len by supp
F = 0.6386, num df = 29, denom df = 29, p-value = 0.2331
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.3039488 1.3416857
sample estimates:
ratio of variances 
         0.6385951 

Interpretation:

The p-value is 0.2331, which is greater than the significance level of 0.05. We therefore fail to reject the null hypothesis: there is no significant evidence that the two variances differ.
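
To make the F-test less of a black box, the statistic and p-value can be computed by hand. This is a minimal sketch in base R, assuming a two-sided alternative and a 5% significance level; the variable names (oj, vc, f_stat, and so on) are illustrative.

```r
# Split tooth length by supplement type
oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# F statistic: ratio of the two sample variances
f_stat <- var(oj) / var(vc)
df1 <- length(oj) - 1
df2 <- length(vc) - 1

# Two-sided p-value from the F distribution
p_lower <- pf(f_stat, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

# Decision at the 5% significance level
alpha <- 0.05
cat("F =", round(f_stat, 4), " p-value =", round(p_value, 4), "\n")
cat(if (p_value < alpha) "Reject" else "Fail to reject",
    "the null hypothesis of equal variances\n")
```

The values printed here should agree with the F statistic and p-value reported by var.test().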

Bartlett’s test

Bartlett’s test checks whether k samples come from populations with equal variances, where k can be two or more. Some statistical tests, such as ANOVA, assume that variances are equal across groups or samples, and Bartlett’s test can be used to verify that assumption. It is suited to normally distributed data and is sensitive to departures from normality.

Statistical Hypothesis:

  • Null Hypothesis: All population variances are equal
  • Alternative Hypothesis: At least two of the variances differ

Implementation in R:

R provides the function bartlett.test() in the stats package to compute Bartlett’s test. The syntax for this function is given below:

Syntax:

bartlett.test(formula, data)

Parameters:

formula: a formula of the form values ~ groups

data: a matrix or data frame containing the variables in the formula

Example:

R




# R program to illustrate
# Bartlett's test
  
# Using bartlett.test()
result <- bartlett.test(len ~ supp, data = ToothGrowth)
  
# print the result
print(result)


Output:

Bartlett test of homogeneity of variances

data:  len by supp
Bartlett's K-squared = 1.4217, df = 1, p-value = 0.2331
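
Unlike the F-test, Bartlett’s test is not limited to two groups. As a sketch of the more-than-two-groups case, the example below groups tooth length by the three dose levels instead of by supplement type; converting dose with factor() and the name result_dose are choices made for this illustration.

```r
# Treat dose as a grouping factor with three levels (0.5, 1 and 2)
result_dose <- bartlett.test(len ~ factor(dose), data = ToothGrowth)
print(result_dose)

# The degrees of freedom equal the number of groups minus one
print(result_dose$parameter)
```

As with the F-test, a p-value above the chosen significance level means the null hypothesis of equal variances is not rejected.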

Levene’s test

In statistics, Levene’s test is an inferential statistic used to assess the equality of variances of a variable measured in two or more groups. Some common statistical procedures assume that the populations from which the samples are drawn have equal variances, and Levene’s test assesses this assumption. It tests the null hypothesis that the population variances are equal, known as homogeneity of variance or homoscedasticity. It is an alternative to Bartlett’s test that is less sensitive to departures from normality.

Statistical Hypothesis:

  • Null Hypothesis: All population variances are equal
  • Alternative Hypothesis: At least two of the variances differ

Implementation in R:

The car package provides the function leveneTest() to compute Levene’s test. The package is not part of base R and must be installed before use. The syntax for this function is given below:

Syntax:

leveneTest(formula, data)

Parameters:

formula: a formula of the form values ~ groups

data: a data frame containing the variables in the formula

Example:

R




# R program to illustrate
# Levene's test
  
# Load the car package (install it once with install.packages("car"))
library(car)
  
# Using leveneTest()
result <- leveneTest(len ~ supp, data = ToothGrowth)
  
# print the result
print(result)


Output:

Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  1  1.2136 0.2752
      58     
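
The "center = median" note in the output hints at how the test works: each observation is replaced by its absolute deviation from its group median, and a one-way ANOVA is run on those deviations. The following is a sketch of that idea using base R only, so it runs without the car package; the names group_median, abs_dev and levene_table are illustrative.

```r
# Absolute deviation of each observation from its group median
group_median <- tapply(ToothGrowth$len, ToothGrowth$supp, median)
abs_dev <- abs(ToothGrowth$len - group_median[as.character(ToothGrowth$supp)])

# One-way ANOVA on the absolute deviations gives the Levene F statistic
levene_table <- anova(lm(abs_dev ~ ToothGrowth$supp))
print(levene_table)
```

The F value and p-value in this table should correspond to those reported by leveneTest() with its default median centering. A p-value above 0.05 means the null hypothesis of equal variances is not rejected.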

Fligner-Killeen test

The Fligner-Killeen test is a non-parametric, rank-based test for homogeneity of group variances. It is useful when the data are not normally distributed or when outliers in the data set cannot be resolved. Among the tests for homogeneity of variances, it is one of the most robust against departures from normality.

Statistical Hypothesis:

  • Null Hypothesis: All population variances are equal
  • Alternative Hypothesis: At least two of the variances differ

Implementation in R:

R provides the function fligner.test() in the stats package to compute the Fligner-Killeen test. The syntax for this function is given below:

Syntax:

fligner.test(formula, data)

Parameters:

formula: a formula of the form values ~ groups

data: a matrix or data frame containing the variables in the formula

Example:

R




# R program to illustrate
# Fligner-Killeen test
  
# fligner.test() is in the stats package, which R attaches by default,
# so no library() call is needed
  
# Using fligner.test()
result <- fligner.test(len ~ supp, data = ToothGrowth)
  
# print the result
print(result)


Output:

Fligner-Killeen test of homogeneity of variances

data:  len by supp
Fligner-Killeen:med chi-squared = 0.97034, df = 1, p-value = 0.3246
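
To compare the conclusions side by side, the p-values of the base R tests can be collected into one table. This is a minimal sketch assuming a 5% significance level; the names p_values, alpha and summary_table are illustrative, and Levene’s test is left out only because it needs the car package.

```r
# p-values of the three tests available in the stats package
p_values <- c(
  F_test          = var.test(len ~ supp, data = ToothGrowth)$p.value,
  Bartlett        = bartlett.test(len ~ supp, data = ToothGrowth)$p.value,
  Fligner_Killeen = fligner.test(len ~ supp, data = ToothGrowth)$p.value
)

# Flag which tests reject the null hypothesis of equal variances
alpha <- 0.05
summary_table <- data.frame(
  test        = names(p_values),
  p_value     = round(unname(p_values), 4),
  reject_null = unname(p_values) < alpha
)
print(summary_table)
```

For this data set, none of the tests rejects the null hypothesis, so there is no significant evidence that tooth length variance differs between the two supplement types.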



Last Updated : 12 Oct, 2020