Whenever we fit a linear regression model in R, the model takes on the following form:
Y = β0 + β1X1 + … + βkXk + ϵ
where X1, …, Xk are the predictor variables and ϵ is an error term that is independent of them.
No matter how well the predictors explain the values of Y, there will always be some random error in the model. One way to measure the dispersion of this random error is the residual standard error, which estimates the standard deviation of the error term ϵ from the model's residuals.
The residual standard error of a regression model is calculated as:
Residual standard error = √(SSresiduals / dfresiduals)
where:
- SSresiduals: The residual sum of squares.
- dfresiduals: The residual degrees of freedom, calculated as n – k – 1, where n = total observations and k = number of predictor variables (the model coefficients, excluding the intercept).
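As a quick check of the degrees-of-freedom formula, here is a minimal sketch that assumes the same two-predictor mtcars model used in the methods below. The variable names n, k, and df_manual are chosen only for illustration.

```r
# Fit the two-predictor model used throughout this tutorial
data(mtcars)
model <- lm(mpg ~ disp + hp, data = mtcars)

n <- nrow(mtcars)              # total observations
k <- length(coef(model)) - 1   # predictors, excluding the intercept

# Residual degrees of freedom from the formula n - k - 1
df_manual <- n - k - 1

# Compare with the value R stores on the fitted model
df_manual == df.residual(model)
```

With 32 observations and 2 predictors, both values come out to 29, which is the degrees of freedom reported in the model summary.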
There are three methods we can use to calculate the residual standard error of a regression model in R.
Method 1: Analyze the Model Summary
The first way to obtain the residual standard error is to simply fit a linear regression model and then use the summary() command to obtain the model results. Then, just look for “residual standard error” near the bottom of the output:
#load built-in mtcars dataset
data(mtcars)

#fit regression model
model <- lm(mpg ~ disp + hp, data = mtcars)

#view model summary
summary(model)

Call:
lm(formula = mpg ~ disp + hp, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-4.7945 -2.3036 -0.8246  1.8582  6.9363

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.735904   1.331566  23.083 ...
...

Residual standard error: 3.127 on 29 degrees of freedom
Multiple R-squared: 0.7482, Adjusted R-squared: 0.7309
F-statistic: 43.09 on 2 and 29 DF, p-value: 2.062e-09
We can see that the residual standard error is 3.127.
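If you would rather pull the value out of the summary than read it off the printed output, the following sketch shows two ways that should work for an ordinary lm fit. The names rse_summary and rse_sigma are illustrative only.

```r
data(mtcars)
model <- lm(mpg ~ disp + hp, data = mtcars)

# summary() returns a list; the RSE is stored in its sigma element
rse_summary <- summary(model)$sigma

# sigma() extracts the same quantity directly from the fitted model
rse_sigma <- sigma(model)

# Rounded to three decimals, as in the printed summary
round(rse_summary, 3)

# Check that the two approaches agree
all.equal(rse_summary, rse_sigma)
```

The printed summary rounds the value for display, while the extracted value keeps full precision.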
Method 2: Use a Simple Formula
Another way to obtain the residual standard error (RSE) is to fit a linear regression model and then use the following formula to calculate RSE:
sqrt(deviance(model)/df.residual(model))
Here is how to implement this formula in R:
#load built-in mtcars dataset
data(mtcars)

#fit regression model
model <- lm(mpg ~ disp + hp, data = mtcars)

#calculate residual standard error
sqrt(deviance(model)/df.residual(model))

[1] 3.126601
We can see that the residual standard error is 3.126601.
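The formula works because, for an unweighted lm fit, deviance() returns the residual sum of squares. A small sketch to confirm this on the same model (ss_resid is an illustrative name):

```r
data(mtcars)
model <- lm(mpg ~ disp + hp, data = mtcars)

# Residual sum of squares computed directly from the residuals
ss_resid <- sum(residuals(model)^2)

# For an unweighted linear model this should equal deviance(model)
all.equal(deviance(model), ss_resid)
```

Note that this equivalence is specific to linear models; for other model classes, such as glm fits, deviance() has a different meaning.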
Method 3: Use a Step-By-Step Formula
A third way to obtain the residual standard error is to fit a linear regression model and then calculate each component of the RSE formula step by step:
#load built-in mtcars dataset
data(mtcars)

#fit regression model
model <- lm(mpg ~ disp + hp, data = mtcars)

#calculate the number of predictors (model coefficients minus the intercept)
k <- length(model$coefficients) - 1

#calculate sum of squared residuals
SSE <- sum(model$residuals^2)

#calculate total observations in dataset
n <- length(model$residuals)

#calculate residual standard error
sqrt(SSE/(n-(1+k)))

[1] 3.126601
We can see that the residual standard error is 3.126601.
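To reuse the step-by-step calculation, one option is to wrap it in a small function. The helper rse_manual() below is hypothetical (it is not part of base R), and it assumes a model with an intercept and no dropped (NA) coefficients.

```r
# Hypothetical helper: step-by-step RSE for a fitted lm object
# Assumes the model has an intercept and no NA coefficients
rse_manual <- function(fit) {
  res <- residuals(fit)
  n <- length(res)               # total observations
  k <- length(coef(fit)) - 1     # predictors, excluding the intercept
  sqrt(sum(res^2) / (n - k - 1))
}

data(mtcars)
model <- lm(mpg ~ disp + hp, data = mtcars)

# Compare the manual result with the other two methods
rse_step    <- rse_manual(model)
rse_formula <- sqrt(deviance(model) / df.residual(model))
rse_summary <- summary(model)$sigma

all.equal(rse_step, rse_formula)
all.equal(rse_step, rse_summary)
```

All three methods compute the same quantity, so any difference between them should be no larger than floating-point rounding.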
How to Interpret the Residual Standard Error
As mentioned before, the residual standard error (RSE) is a way to measure the standard deviation of the residuals in a regression model.
The lower the RSE, the more closely the model fits the data (but be careful of overfitting). Because RSE is expressed in the units of the response variable, it is a useful metric for comparing two or more models of the same response to determine which fits the data best.
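As an illustration of such a comparison, the sketch below fits two candidate models for mpg and collects their RSE values. The one-predictor model and the names model_1, model_2, and rse are assumptions made for this example only.

```r
data(mtcars)

# Two candidate models for the same response variable
model_1 <- lm(mpg ~ disp, data = mtcars)
model_2 <- lm(mpg ~ disp + hp, data = mtcars)

# Collect the residual standard error of each model
rse <- c(disp_only = sigma(model_1), disp_hp = sigma(model_2))
rse

# Name of the model with the smallest RSE
names(which.min(rse))
```

Both values are in miles per gallon, so they can be compared directly. A lower RSE on the training data does not by itself guarantee better predictions on new data.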