How to Find Coefficient of Determination (R-Squared) in R

The coefficient of determination (commonly denoted R²) is the proportion of the variance in the response variable that can be explained by the explanatory variables in a regression model.
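
In symbols, this definition is usually written as shown here, where y_i are the observed responses, ŷ_i the model's fitted values, and ȳ the mean response. This notation is assumed for illustration and does not appear in the tutorial itself:

```latex
R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}}
    = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```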

This tutorial provides an example of how to find and interpret R² in a regression model in R.

Related: What is a Good R-squared Value?

Example: Find & Interpret R-Squared in R

Suppose we have the following dataset that contains data for the number of hours studied, prep exams taken, and exam score received for 15 students:

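#create data frame (first six of 15 rows shown)
df <- data.frame(hours=c(1, 2, 2, 4, 2, 1, ...),
                 prep_exams=c(1, 3, 3, 5, 2, 2, ...),
                 score=c(76, 78, 85, 88, 72, 69, ...))

#view first six rows of data frame
head(df)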

  hours prep_exams score
1     1          1    76
2     2          3    78
3     2          3    85
4     4          5    88
5     2          2    72
6     1          2    69

The following code shows how to fit a multiple linear regression model to this dataset and view the model output in R:

#fit regression model
model <- lm(score ~ hours + prep_exams, data=df)

#view model summary
summary(model)

Call:
lm(formula = score ~ hours + prep_exams, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.9896 -2.5514  0.3079  3.3370  7.0352 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  71.8078     3.5222  20.387 1.12e-10 ***
hours         5.0247     0.8964   5.606 0.000115 ***
prep_exams   -1.2975     0.9689  -1.339 0.205339    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.944 on 12 degrees of freedom
Multiple R-squared:  0.7237,	Adjusted R-squared:  0.6776 
F-statistic: 15.71 on 2 and 12 DF,  p-value: 0.0004454

The R-squared of the model (shown near the very bottom of the output) turns out to be 0.7237.

This means that 72.37% of the variation in the exam scores can be explained by the number of hours studied and the number of prep exams taken.

Note that you can also access this value by using the following syntax:

summary(model)$r.squared

[1] 0.7236545
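
As a sanity check on what summary() reports, the value can also be reproduced by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes R's built-in mtcars dataset and a hypothetical model named fit (neither is part of this tutorial's example) so that it runs on its own:

```r
#assumption: built-in mtcars data stands in for the tutorial's df
fit <- lm(mpg ~ wt + hp, data = mtcars)

#residual sum of squares: variation the model leaves unexplained
ss_res <- sum(residuals(fit)^2)

#total sum of squares: variation of the response around its mean
ss_tot <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

#R-squared computed by hand
r2_manual <- 1 - ss_res / ss_tot

#R-squared as reported by summary()
r2_summary <- summary(fit)$r.squared

#the two values agree up to floating-point tolerance
all.equal(r2_manual, r2_summary)
```

The same steps should apply to any lm() fit with an intercept, provided the response vector passed to the total sum of squares matches the rows actually used in the fit.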

How to Interpret the R-Squared Value

For a linear regression model fit with an intercept, the R-squared value always lies between 0 and 1.

A value of 1 indicates that the explanatory variables can perfectly explain the variance in the response variable, and a value of 0 indicates that the explanatory variables have no ability to explain the variance in the response variable.
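
The two extremes can be illustrated with tiny made-up datasets. The values below are hypothetical and chosen only so that the fit is exact in one case and completely flat in the other:

```r
#hypothetical data: y1 is an exact linear function of x1
x1 <- 1:5
y1 <- 2 * x1 + 1

#summary() warns about an essentially perfect fit, so the warning is suppressed
r2_perfect <- suppressWarnings(summary(lm(y1 ~ x1))$r.squared)

#hypothetical data: y2 is symmetric around x2 = 0, so the fitted slope is zero
x2 <- -2:2
y2 <- x2^2

r2_none <- summary(lm(y2 ~ x2))$r.squared

#r2_perfect is 1 and r2_none is 0, up to floating-point error
c(perfect = r2_perfect, none = r2_none)
```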

In general, the larger the R-squared value of a regression model, the better the explanatory variables are able to predict the value of the response variable.

For details on how to determine whether a given R-squared value is considered “good” for a given regression model, see the related article “What is a Good R-squared Value?”

Related: How to Calculate Adjusted R-Squared in R
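
The model summary shown earlier also reports an adjusted R-squared of 0.6776. As a rough check, that figure can be reproduced from the reported R-squared using the standard adjustment 1 - (1 - R²)(n - 1)/(n - p - 1), where n = 15 students and p = 2 predictors come from the example. The variable names below are illustrative only:

```r
#values taken from the example's model output
r2 <- 0.7236545  #multiple R-squared
n  <- 15         #number of observations (students)
p  <- 2          #number of predictors (hours, prep_exams)

#standard adjusted R-squared formula
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

round(adj_r2, 4)
#[1] 0.6776
```

For a fitted model, the same quantity is stored in the summary object as summary(model)$adj.r.squared.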
