An F-test produces an F-statistic. To find the p-value associated with an F-statistic in R, you can use the following command:
pf(fstat, df1, df2, lower.tail = FALSE)
- fstat – the value of the F-statistic
- df1 – the numerator degrees of freedom
- df2 – the denominator degrees of freedom
- lower.tail – whether or not to return the probability associated with the lower tail of the F distribution. This is TRUE by default, so we set it to FALSE to get the upper-tail probability, which is the p-value.
For example, here is how to find the p-value associated with an F-statistic of 5, with 3 numerator degrees of freedom (df1) and 14 denominator degrees of freedom (df2):
pf(5, 3, 14, lower.tail = FALSE)
#[1] 0.01457807
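To make the role of lower.tail concrete, the short sketch below compares the upper-tail and lower-tail probabilities for the same F-statistic. The variable names (f_stat, upper, lower) are illustrative only; the two probabilities should sum to 1, so 1 - pf(...) is an equivalent way to get the p-value.

```r
# Same inputs as the example above
f_stat <- 5
df1 <- 3
df2 <- 14

# Upper-tail probability: P(F > f_stat), i.e. the p-value
upper <- pf(f_stat, df1, df2, lower.tail = FALSE)

# Lower-tail probability: P(F <= f_stat), the default behaviour of pf()
lower <- pf(f_stat, df1, df2)

# The two tails are complements of each other
print(all.equal(upper, 1 - lower))
```

For very small p-values, lower.tail = FALSE is generally the safer choice, since subtracting a number close to 1 from 1 can lose precision.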
One of the most common uses of an F-test is for testing the overall significance of a regression model. In the following example, we show how to calculate the p-value of the F-statistic for a regression model.
Example: Calculating p-value from F-statistic
Suppose we have a dataset that shows the total number of hours studied, total prep exams taken, and final exam score received for 12 different students:
#create dataset
data <- data.frame(study_hours = c(3, 7, 16, 14, 12, 7, ...),
                   prep_exams = c(2, 6, 5, 2, 7, 4, 4, 2, 8, 4, 1, 3),
                   final_score = c(76, 88, 96, 90, 98, 80, 86, 89, 68, 75, 72, 76))

#view first six rows of dataset
head(data)

#  study_hours prep_exams final_score
#1           3          2          76
#2           7          6          88
#3          16          5          96
#4          14          2          90
#5          12          7          98
#6           7          4          80
Next, we can fit a linear regression model to this data using study hours and prep exams as the predictor variables and final score as the response variable. Then, we can view the output of the model:
#fit regression model
model <- lm(final_score ~ study_hours + prep_exams, data = data)

#view output of the model
summary(model)

#Call:
#lm(formula = final_score ~ study_hours + prep_exams, data = data)
#
#Residuals:
#    Min      1Q  Median      3Q     Max
#-13.128  -5.319   2.168   3.458   9.341
#
#Coefficients:
#            Estimate Std. Error t value Pr(>|t|)
#(Intercept)   66.990      6.211  10.785  1.9e-06 ***
#study_hours    1.300      0.417   3.117   0.0124 *
#prep_exams     1.117      1.025   1.090   0.3041
#---
#Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
#Residual standard error: 7.327 on 9 degrees of freedom
#Multiple R-squared: 0.5308, Adjusted R-squared: 0.4265
#F-statistic: 5.091 on 2 and 9 DF, p-value: 0.0332
On the very last line of the output we can see that the F-statistic for the overall regression model is 5.091. This F-statistic has 2 degrees of freedom for the numerator and 9 degrees of freedom for the denominator. R automatically calculates that the p-value for this F-statistic is 0.0332.
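As a rough cross-check, the F-statistic in this output can be recovered from the reported R-squared. The sketch below assumes the standard overall F formula for a model with an intercept, F = (R² / k) / ((1 - R²) / (n - k - 1)), with k = 2 predictors and n = 12 students; the variable names are illustrative, and the inputs are the rounded values printed in the summary, so the result only matches to about two decimal places.

```r
# Values read off the regression summary above (rounded as printed)
r_squared <- 0.5308   # Multiple R-squared
k <- 2                # number of predictor variables
n <- 12               # number of observations (students)

# Degrees of freedom for the overall F-test
df_num <- k           # numerator df
df_den <- n - k - 1   # denominator df

# Overall F-statistic computed from R-squared
f_from_r2 <- (r_squared / df_num) / ((1 - r_squared) / df_den)

print(c(F = f_from_r2, df1 = df_num, df2 = df_den))
```

The degrees of freedom come out as 2 and 9, and the F-statistic comes out at roughly 5.09, in line with the last line of the summary.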
To reproduce the p-value of 0.0332 ourselves, we can pass the F-statistic and its degrees of freedom to pf():
pf(5.091, 2, 9, lower.tail = FALSE)
#[1] 0.0331947
Notice that we get the same answer (but with more decimals displayed) as the linear regression output above.
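Rather than typing the F-statistic by hand, it can also be read from the fitted model. The sketch below is one possible way to do this: it uses R's built-in mtcars dataset as a stand-in (not the exam data above) and relies on the fstatistic element of the summary object, a named vector holding value, numdf and dendf. The object names fit, f and p_value are illustrative.

```r
# Fit an example model on a built-in dataset (stand-in for the exam data)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Named vector with the F-statistic and its degrees of freedom
f <- summary(fit)$fstatistic

# Upper-tail probability of the F distribution, i.e. the overall p-value
p_value <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

print(f)
print(p_value)
```

Because this uses the unrounded F-statistic stored in the model object, the result is not affected by the rounding in the printed summary.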