Tuesday, March 11, 2025

How to Calculate Spearman Rank Correlation in Python

In statistics, correlation refers to the strength and direction of a relationship between two variables. The value of a correlation coefficient can range from -1 to 1, with the following interpretations:

  • -1: a perfect negative relationship between two variables
  • 0: no linear (or, for rank-based coefficients, monotonic) relationship between two variables
  • 1: a perfect positive relationship between two variables

One special type of correlation is the Spearman rank correlation, which measures the strength and direction of the monotonic relationship between two ranked variables (e.g. the rank of a student’s math exam score vs. the rank of their science exam score in a class).
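
The difference between a rank-based and a linear measure can be sketched with a small made-up example. The arrays x and y below are hypothetical and are not part of the exam data used later: y increases with x, but not along a straight line. Under that assumption the Spearman coefficient should be 1 while the Pearson coefficient stays below 1, and the Spearman coefficient should match the Pearson coefficient computed on the ranks.

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

# hypothetical data: y increases with x, but not along a straight line
x = np.array([1, 2, 3, 4, 5])
y = x ** 3

# Spearman works on ranks, so a strictly increasing relationship gives 1
rho, _ = spearmanr(x, y)

# Pearson measures linear association on the raw values
r, _ = pearsonr(x, y)

# Spearman is Pearson's coefficient applied to the ranks of the data
r_on_ranks, _ = pearsonr(rankdata(x), rankdata(y))

print(rho, r, r_on_ranks)
```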

This tutorial explains how to calculate the Spearman rank correlation between two variables in Python.

Example: Spearman Rank Correlation in Python

Suppose we have the following pandas DataFrame that contains the math exam score and science exam score of 10 students in a particular class:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'student': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                   'math': [70, 78, 90, 87, 84, 86, 91, 74, 83, 85],
                   'science': [90, 94, 79, 86, 84, 83, 88, 92, 76, 75]})
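
Since the Spearman coefficient is computed from ranks rather than raw scores, it can help to look at the ranks first. The following is a minimal sketch that uses the pandas rank() method; the column names math_rank and science_rank are chosen here for illustration only.

```python
import pandas as pd

#create DataFrame (same data as above)
df = pd.DataFrame({'student': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                   'math': [70, 78, 90, 87, 84, 86, 91, 74, 83, 85],
                   'science': [90, 94, 79, 86, 84, 83, 88, 92, 76, 75]})

#rank each score column (1 = lowest score); tied scores would share an average rank
df['math_rank'] = df['math'].rank()
df['science_rank'] = df['science'].rank()

print(df)
```

Neither column contains tied scores, so every rank from 1 to 10 appears exactly once in each rank column.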

To calculate the Spearman rank correlation between the math and science scores, we can use the spearmanr() function from scipy.stats:

from scipy.stats import spearmanr

#calculate Spearman Rank correlation and corresponding p-value
rho, p = spearmanr(df['math'], df['science'])

#print Spearman rank correlation and p-value
print(rho)

-0.41818181818181815

print(p)

0.22911284098281892

From the output we can see that the Spearman rank correlation is -0.41818 and the corresponding p-value is 0.22911.

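This indicates a moderate negative association between the math and science exam scores in this sample: students who ranked higher in math tended to rank lower in science.

However, since the p-value (0.22911) is not less than 0.05, the correlation is not statistically significant at the 0.05 level, so we cannot conclude from these 10 students that the two sets of scores are associated.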
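
To see where these two numbers come from, the calculation can be reproduced by hand. The sketch below makes two assumptions: that no scores are tied, which holds for this data and allows the shortcut formula 1 - 6*sum(d^2) / (n*(n^2 - 1)), and that the default p-value from spearmanr() is based on a t distribution with n - 2 degrees of freedom. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import rankdata, spearmanr, t

math_scores = np.array([70, 78, 90, 87, 84, 86, 91, 74, 83, 85])
science_scores = np.array([90, 94, 79, 86, 84, 83, 88, 92, 76, 75])
n = len(math_scores)

#difference between each student's math rank and science rank
d = rankdata(math_scores) - rankdata(science_scores)

#shortcut formula, valid here because neither variable has tied scores
rho_manual = 1 - 6 * np.sum(d ** 2) / (n * (n ** 2 - 1))

#t statistic and two-sided p-value with n - 2 degrees of freedom
t_stat = rho_manual * np.sqrt((n - 2) / (1 - rho_manual ** 2))
p_manual = 2 * t.sf(abs(t_stat), n - 2)

#compare with the values returned by scipy
rho, p = spearmanr(math_scores, science_scores)
print(rho_manual, rho)
print(p_manual, p)
```

Here the squared rank differences sum to 234, so the coefficient is 1 - 1404/990, which is approximately -0.41818.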

Note that we can also index the result of spearmanr() to extract just the correlation coefficient or just the p-value:

#extract Spearman Rank correlation coefficient
spearmanr(df['math'], df['science'])[0]

-0.41818181818181815

#extract p-value of Spearman Rank correlation coefficient
spearmanr(df['math'], df['science'])[1] 

0.22911284098281892
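
Each of the calls above recomputes the correlation. A possible alternative, sketched here, is to call spearmanr() once, keep the result, and index it as needed. If only the coefficient is required, the pandas corr() method with method='spearman' is another option, although it does not return a p-value. The variable names result and rho_pandas are illustrative.

```python
import pandas as pd
from scipy.stats import spearmanr

#create DataFrame (same data as above)
df = pd.DataFrame({'student': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                   'math': [70, 78, 90, 87, 84, 86, 91, 74, 83, 85],
                   'science': [90, 94, 79, 86, 84, 83, 88, 92, 76, 75]})

#call spearmanr() once and reuse the result
result = spearmanr(df['math'], df['science'])
rho, p = result[0], result[1]

#pandas alternative: returns the coefficient only, with no p-value
rho_pandas = df['math'].corr(df['science'], method='spearman')

print(rho, p, rho_pandas)
```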

Additional Resources

How to Calculate Spearman Rank Correlation in R
How to Calculate Spearman Rank Correlation in Excel
How to Calculate Spearman Rank Correlation in Stata
