
Bartlett’s Test in R Programming

Last Updated : 25 Aug, 2020

In statistics, Bartlett’s test is used to test whether k samples come from populations with equal variances. Equality of variances across populations is called homoscedasticity, or homogeneity of variances. Some statistical tests, such as ANOVA, assume that variances are equal across groups or samples, and Bartlett’s test can be used to check that assumption. It compares the variances of two or more samples and is appropriate for normally distributed data; because it is sensitive to departures from normality, a significant result may reflect non-normal data rather than unequal variances. Several tests are available for checking the equality (homogeneity) of variance across groups, including:

  • F-test
  • Bartlett’s test
  • Levene’s test
  • Fligner-Killeen test

These tests are easy to perform in R. This article shows how to perform Bartlett’s test in R.
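For orientation, three of the four tests listed above ship with base R's stats package, and the sketch below runs them on the built-in PlantGrowth data used later in this article. Levene's test is not part of base R (it is commonly run with leveneTest() from the add-on car package), so it is left out here. The F-test, var.test(), compares exactly two groups, so the data are restricted to the ctrl and trt1 treatments; that restriction and the variable names are choices made only for this illustration.

```r
# F-test: limited to two groups, so keep only ctrl and trt1 (illustrative choice)
two_groups <- droplevels(subset(PlantGrowth, group %in% c("ctrl", "trt1")))
f_res <- var.test(weight ~ group, data = two_groups)

# Bartlett's test: two or more groups, assumes normally distributed data
b_res <- bartlett.test(weight ~ group, data = PlantGrowth)

# Fligner-Killeen test: rank-based alternative for two or more groups
fk_res <- fligner.test(weight ~ group, data = PlantGrowth)

# Collect the p-values side by side
print(c(F_test = f_res$p.value,
        Bartlett = b_res$p.value,
        Fligner_Killeen = fk_res$p.value))
```

Each call returns an object of class "htest", so the results can be inspected in the same way.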

Statistical Hypotheses for Bartlett’s test

A hypothesis is a statement about a population parameter. Hypothesis testing is a statistical method for making decisions from experimental data: it evaluates two mutually exclusive statements about a population to determine which one is better supported by the sample data. To know more about statistical hypotheses, please refer to Understanding Hypothesis Testing. For Bartlett’s test the statistical hypotheses are:

  • Null Hypothesis: all population variances are equal
  • Alternative Hypothesis: at least two of the population variances differ

Implementation in R

R provides the function bartlett.test(), available in the stats package, to perform Bartlett’s test. The syntax for this function is given below:

Syntax:

bartlett.test(formula, data)

Parameters:

formula: a formula of the form values ~ groups

data: an optional matrix or data frame containing the variables in the formula

 

Returns:

statistic: Bartlett’s K-squared test statistic

parameter: the degrees of freedom of the approximate chi-squared distribution of the test statistic.

p.value: the p-value of the test
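As a minimal sketch of how these components can be read from the returned object, the example below runs the test on simulated data and extracts each value. The data frame toy and its columns values and groups are hypothetical names created only for this illustration.

```r
# Simulated data: three groups of 8 values with different spreads
# (toy, values and groups are illustrative names, not part of any dataset)
set.seed(42)
toy <- data.frame(
  values = c(rnorm(8, sd = 1), rnorm(8, sd = 1.5), rnorm(8, sd = 2)),
  groups = factor(rep(c("a", "b", "c"), each = 8))
)

res <- bartlett.test(values ~ groups, data = toy)

# The result is a list of class "htest"
print(class(res))

# Individual components
print(res$statistic)  # Bartlett's K-squared
print(res$parameter)  # degrees of freedom (number of groups minus 1)
print(res$p.value)    # p-value of the test
```

Extracting the components this way is convenient when the p-value has to be compared against a significance level in code rather than read from the printed output.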

Two cases arise depending on the format of the data, and each requires a different form of the function call.

If data is in the stacked form: Stacked form means the values for all samples are stored in one variable, with a second variable identifying the sample each value belongs to. In this case, use the following command:

bartlett.test(values ~ groups, dataset)

where:

values: the name of the variable containing the data values

groups: the name of the variable that specifies which sample each value belongs to

If data is in the unstacked form: Unstacked form means each sample is stored in a separate variable. In this case, nest the variable names inside the list() function as shown below:

bartlett.test(list(dataset$sample1, dataset$sample2, dataset$sample3))
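The two call forms can be checked against each other. The sketch below builds a simulated unstacked data frame (the contents of dataset, sample1, sample2 and sample3 are made up for illustration), runs the list() form, then converts the data to stacked form with stack() and runs the formula form.

```r
# Simulated unstacked data: one column per sample (values are illustrative)
set.seed(1)
dataset <- data.frame(
  sample1 = rnorm(10, sd = 1),
  sample2 = rnorm(10, sd = 2),
  sample3 = rnorm(10, sd = 3)
)

# Unstacked form: pass the samples as a list
unstacked_res <- bartlett.test(list(dataset$sample1,
                                    dataset$sample2,
                                    dataset$sample3))

# Convert to stacked form: one column of values, one column of group labels
stacked <- stack(dataset)
names(stacked) <- c("values", "groups")

# Stacked form: use the formula interface
stacked_res <- bartlett.test(values ~ groups, data = stacked)

# Both forms should give the same test statistic and p-value
print(all.equal(unname(unstacked_res$statistic),
                unname(stacked_res$statistic)))
print(all.equal(unstacked_res$p.value, stacked_res$p.value))
```

Both calls describe the same three samples, so the choice between them depends only on how the data happen to be stored.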

Examples for Bartlett’s test

Bartlett’s test with one independent variable:

Consider R’s built-in PlantGrowth dataset, which gives the dried weight of three groups of ten batches of plants, where each group received a different treatment. The weight variable gives the weight of the batch and the group variable gives the treatment received: ctrl, trt1 or trt2. To view the data set, run the command below:

R




print(PlantGrowth)


Output:

    weight group
1    4.17  ctrl
2    5.58  ctrl
3    5.18  ctrl
4    6.11  ctrl
5    4.50  ctrl
6    4.61  ctrl
7    5.17  ctrl
8    4.53  ctrl
9    5.33  ctrl
10   5.14  ctrl
11   4.81  trt1
12   4.17  trt1
13   4.41  trt1
14   3.59  trt1
15   5.87  trt1
16   3.83  trt1
17   6.03  trt1
18   4.89  trt1
19   4.32  trt1
20   4.69  trt1
21   6.31  trt2
22   5.12  trt2
23   5.54  trt2
24   5.50  trt2
25   5.37  trt2
26   5.29  trt2
27   4.92  trt2
28   6.15  trt2
29   5.80  trt2
30   5.26  trt2

Suppose one wants to use Bartlett’s test to determine whether the variance in weight is the same for all treatment groups at a significance level of 0.05. Here there is only one independent variable, group. To perform the test, use the command below:

R




# R program to illustrate
# Bartlett’s test

# Using bartlett.test()
result <- bartlett.test(weight ~ group, data = PlantGrowth)

# Print the result
print(result)


Output:

Bartlett test of homogeneity of variances

data:  weight by group
Bartlett's K-squared = 2.8786, df = 2, p-value = 0.2371

Explanation:

From the output, the p-value of 0.2371 is not less than the significance level of 0.05, so the null hypothesis that the variance is the same for all treatment groups cannot be rejected. In other words, there is not sufficient evidence to suggest that the variance in plant weight differs among the three treatment groups.
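To see where the K-squared value in the output comes from, the statistic can be reproduced by hand. With k groups, group sizes n_i, group variances s_i^2, total sample size N and pooled variance s_p^2, Bartlett's statistic is [(N - k) ln(s_p^2) - sum((n_i - 1) ln(s_i^2))] divided by the correction factor 1 + [sum(1 / (n_i - 1)) - 1 / (N - k)] / (3 (k - 1)), and it is compared with a chi-squared distribution with k - 1 degrees of freedom. The sketch below applies this formula to PlantGrowth; the variable names are chosen for illustration.

```r
# Group sizes and sample variances
n_i <- tapply(PlantGrowth$weight, PlantGrowth$group, length)
v_i <- tapply(PlantGrowth$weight, PlantGrowth$group, var)

k <- length(n_i)  # number of groups
N <- sum(n_i)     # total number of observations

# Pooled variance: weighted average of the group variances
pooled_var <- sum((n_i - 1) * v_i) / (N - k)

# Numerator and correction factor of Bartlett's statistic
numerator  <- (N - k) * log(pooled_var) - sum((n_i - 1) * log(v_i))
correction <- 1 + (sum(1 / (n_i - 1)) - 1 / (N - k)) / (3 * (k - 1))

k_squared <- numerator / correction

# p-value from the chi-squared distribution with k - 1 degrees of freedom
p_value <- pchisq(k_squared, df = k - 1, lower.tail = FALSE)

# Compare with the built-in function
builtin <- bartlett.test(weight ~ group, data = PlantGrowth)
print(c(manual = k_squared, builtin = unname(builtin$statistic)))
print(c(manual = p_value, builtin = builtin$p.value))
```

The manual and built-in values should agree up to floating-point rounding.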

Bartlett’s test with multiple independent variables:

To run the test with multiple independent variables, use the interaction() function to collapse the factors into a single variable containing all combinations of their levels. Here let’s take R’s built-in ToothGrowth data set.

R




# R program to illustrate
# Bartlett’s test

# Print the first 10 rows
# of the data set
print(head(ToothGrowth, 10))

# Applying bartlett.test()
result <- bartlett.test(len ~ interaction(supp, dose),
                        data = ToothGrowth)

# Print the result
print(result)


Output:

    len supp dose
1   4.2   VC  0.5
2  11.5   VC  0.5
3   7.3   VC  0.5
4   5.8   VC  0.5
5   6.4   VC  0.5
6  10.0   VC  0.5
7  11.2   VC  0.5
8  11.2   VC  0.5
9   5.2   VC  0.5
10  7.0   VC  0.5

    Bartlett test of homogeneity of variances

data:  len by interaction(supp, dose)
Bartlett's K-squared = 6.9273, df = 5, p-value = 0.2261
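The same decision rule applies to this output: the p-value of 0.2261 is greater than 0.05, so the hypothesis of equal variance across the six combinations of supp and dose is not rejected. A minimal sketch of that check in code follows; alpha, cell_var and decision are names chosen for illustration, and 0.05 is assumed as the significance level.

```r
# Run the test across all supp-by-dose combinations
res <- bartlett.test(len ~ interaction(supp, dose), data = ToothGrowth)

# Sample variance of len within each combination (six cells)
cell_var <- with(ToothGrowth, tapply(len, interaction(supp, dose), var))
print(round(cell_var, 2))

# Compare the p-value with the chosen significance level
alpha <- 0.05
decision <- if (res$p.value < alpha) {
  "Reject the null hypothesis: at least two variances differ"
} else {
  "Fail to reject the null hypothesis of equal variances"
}
print(decision)
```

Printing the per-cell variances alongside the test result shows which groups are being compared, something the single K-squared value does not reveal.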


