In statistics, correlation refers to the strength and direction of a relationship between two variables. The value of a correlation coefficient can range from -1 to 1, with the following interpretations:
- -1: a perfect negative relationship between two variables
- 0: no linear (or, for rank-based coefficients, no monotonic) relationship between two variables
- 1: a perfect positive relationship between two variables
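As a quick illustration of these three reference values, the sketch below uses base R's cor() function, which computes the Pearson coefficient by default, on a small made-up vector. The values of x and the three derived relationships are hypothetical and chosen only so that each case lands exactly on -1, 0, or 1.

```r
# Hypothetical values chosen only to illustrate the three reference points
x <- c(-2, -1, 0, 1, 2)

r_positive <- cor(x, 2 * x + 1)  # y rises in a straight line with x
r_negative <- cor(x, -3 * x)     # y falls in a straight line with x
r_zero     <- cor(x, x^2)        # symmetric U-shape: related, but not linearly

print(c(positive = r_positive, negative = r_negative, zero = r_zero))
```

The last case is worth noting: y is fully determined by x, yet the coefficient is 0, because a coefficient of 0 only rules out a linear trend.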
One special type of correlation is the Spearman rank correlation, which measures the correlation between two ranked variables (e.g., the rank of a student’s math exam score vs. the rank of their science exam score in a class).
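To make the idea of "ranked variables" concrete, the following sketch shows that the Spearman coefficient is the Pearson correlation computed on ranks rather than on raw values. The exam scores and the variable names (math, science) are hypothetical and are not taken from any data in this article.

```r
# Hypothetical exam scores for six students (illustrative values only)
math    <- c(88, 72, 95, 60, 79, 84)
science <- c(85, 70, 80, 65, 90, 75)

# Convert raw scores to ranks (1 = lowest score)
math_rank    <- rank(math)
science_rank <- rank(science)

# Pearson correlation of the ranks
rho_from_ranks <- cor(math_rank, science_rank, method = "pearson")

# Built-in Spearman correlation on the raw scores
rho_builtin <- cor(math, science, method = "spearman")

print(rho_from_ranks)
print(rho_builtin)
```

Both values come out to 0.6 for these scores, since the two approaches are the same calculation.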
To calculate the Spearman rank correlation between two variables in R, we can use the following basic syntax:
cor.test(x, y, method = 'spearman')
The following examples show how to use this function in practice.
Example 1: Spearman Rank Correlation Between Vectors
The following code shows how to calculate the Spearman rank correlation between two vectors in R:
#define data: x and y are two numeric vectors of equal length (values not shown)

#calculate Spearman rank correlation between x and y
cor.test(x, y, method = 'spearman')

        Spearman's rank correlation rho

data:  x and y
S = 234, p-value = 0.2324
alternative hypothesis: true rho is not equal to 0
sample estimates:
       rho 
-0.4181818 
From the output we can see that the Spearman rank correlation is -0.41818 and the corresponding p-value is 0.2324.
This indicates that there is a negative correlation between the two vectors.
However, since the p-value of the correlation is not less than 0.05, the correlation is not statistically significant.
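The same reading of the output can be done programmatically. cor.test() returns a list-like object whose estimate and p.value components hold the coefficient and the p-value. The sketch below uses hypothetical vectors (not the data behind the output above), and the 0.05 threshold stored in alpha is simply the conventional cutoff used in this article.

```r
# Hypothetical vectors (illustrative values; not the data behind the output above)
x <- c(12, 15, 11, 19, 14, 17, 13, 18)
y <- c(30, 27, 33, 21, 29, 26, 28, 22)

# Run the test once and keep the result object
result <- cor.test(x, y, method = "spearman")

# Pull out the coefficient and the p-value
rho     <- unname(result$estimate)
p_value <- result$p.value

# Compare the p-value against the chosen significance level
alpha       <- 0.05
significant <- p_value < alpha

cat(sprintf("rho = %.3f, p-value = %.4f, significant at %.2f: %s\n",
            rho, p_value, alpha, significant))
```

Note that when the data contain tied values, cor.test() cannot compute an exact p-value and issues a warning; passing exact = FALSE requests the approximate p-value explicitly.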
Example 2: Spearman Rank Correlation Between Columns in Data Frame
The following code shows how to calculate the Spearman rank correlation between two columns in a data frame:
#define data frame
df <- data.frame(team=c('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'),
                 points=c(67, 70, 75, 78, 73, 89, 84, 99, 90, 91),
                 assists=c(22, 27, 30, 23, 25, 31, 38, 35, 34, 32))

#calculate Spearman rank correlation between points and assists
cor.test(df$points, df$assists, method = 'spearman')

        Spearman's rank correlation rho

data:  df$points and df$assists
S = 36, p-value = 0.01165
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.7818182 
From the output we can see that the Spearman rank correlation is 0.7818 and the corresponding p-value is 0.01165.
This indicates that there is a strong positive correlation between points and assists.
Since the p-value of the correlation is less than 0.05, the correlation is statistically significant.
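The S statistic and rho in this output are directly related. When there are no tied values, S is the sum of squared differences between the two sets of ranks, and rho = 1 - 6S / (n(n^2 - 1)). The sketch below checks this by hand for the points and assists columns from this example; the helper names d, S, n, and rho are chosen here for illustration, and the shortcut formula should not be relied on when ties are present.

```r
# Points and assists columns from Example 2
df <- data.frame(points=c(67, 70, 75, 78, 73, 89, 84, 99, 90, 91),
                 assists=c(22, 27, 30, 23, 25, 31, 38, 35, 34, 32))

n <- nrow(df)

# Difference between each team's rank on points and its rank on assists
d <- rank(df$points) - rank(df$assists)

# Sum of squared rank differences (the S reported by cor.test when there are no ties)
S <- sum(d^2)

# Shortcut formula for Spearman's rho, valid only without ties
rho <- 1 - (6 * S) / (n * (n^2 - 1))

print(S)              # 36
print(round(rho, 7))  # 0.7818182
```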