An F-test is used to test whether two population variances are equal. The null and alternative hypotheses for the test are as follows:
H0: σ12 = σ22 (the population variances are equal)
H1: σ12 ≠ σ22 (the population variances are not equal)
This tutorial explains how to perform an F-test in Python.
Example: F-Test in Python
Suppose we have the following two samples:
x = [18, 19, 22, 25, 27, 28, 41, 45, 51, 55] y = [14, 15, 15, 17, 18, 22, 25, 25, 27, 34]
We can use the following function to perform an F-test to determine if the two populations these samples came from have equal variances:
import numpy as np #define F-test function def f_test(x, y): x = np.array(x) y = np.array(y) f = np.var(x, ddof=1)/np.var(y, ddof=1) #calculate F test statistic dfn = x.size-1 #define degrees of freedom numerator dfd = y.size-1 #define degrees of freedom denominator p = 1-scipy.stats.f.cdf(f, dfn, dfd) #find p-value of F test statistic return f, p #perform F-test f_test(x, y) (4.38712, 0.019127)
The F test statistic is 4.38712 and the corresponding p-value is 0.019127. Since this p-value is less than .05, we would reject the null hypothesis. This means we have sufficient evidence to say that the two population variances are not equal.
Notes
- The F test statistic is calculated as s12 / s22. By default, numpy.var calculates the population variance. To calculate the sample variance, we need to specify ddof=1.
- The p-value corresponds to 1 – cdf of the F distribution with numerator degrees of freedom = n1-1 and denominator degrees of freedom = n2-1.
- This function only works when the first sample variance is larger than the second sample variance. Thus, define the two samples in such a way that they work with the function.
When to Use the F-Test
The F-test is typically used to answer one of the following questions:
1. Do two samples come from populations with equal variances?
2. Does a new treatment or process reduce the variability of some current treatment or process?
Related: How to Perform an F-Test in R