An intraclass correlation coefficient (ICC) is used to determine if items or subjects can be rated reliably by different raters.
The value of an ICC can range from 0 to 1, with 0 indicating no reliability among raters and 1 indicating perfect reliability.
The easiest way to calculate ICC in Python is to use the pingouin.intraclass_corr() function from the pingouin statistical package, which uses the following syntax:
pingouin.intraclass_corr(data, targets, raters, ratings)
where:
- data: The name of the dataframe
- targets: Name of column containing the “targets” (the things being rated)
- raters: Name of column containing the raters
- ratings: Name of column containing the ratings
This tutorial provides an example of how to use this function in practice.
Step 1: Install Pingouin
First, we must install Pingouin:
pip install pingouin
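If you want to confirm that the installation worked, one quick sanity check (shown here as a sketch; any recent version of Pingouin includes intraclass_corr()) is to import the package and print its version:

import pingouin
print(pingouin.__version__)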
Step 2: Create the Data
Suppose four different judges were asked to rate the quality of six different college entrance exams. We can create the following dataframe to hold the ratings of the judges:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'exam': [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6,
                            1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6],
                   'judge': ['A', 'A', 'A', 'A', 'A', 'A',
                             'B', 'B', 'B', 'B', 'B', 'B',
                             'C', 'C', 'C', 'C', 'C', 'C',
                             'D', 'D', 'D', 'D', 'D', 'D'],
                   'rating': [1, 1, 3, 6, 6, 7, 2, 3, 8, 4, 5, 5,
                              0, 4, 1, 5, 5, 6, 1, 2, 3, 3, 6, 4]})

#view first five rows of DataFrame
df.head()
exam judge rating
0 1 A 1
1 2 A 1
2 3 A 3
3 4 A 6
4 5 A 6
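One thing to note: intraclass_corr() expects the data in long format, with one row per exam-judge pair, which is how the DataFrame above is built. If your ratings instead start out in wide format (one column per judge), a quick sketch of the reshape using pandas' melt() function (with hypothetical wide-format data) might look like this:

#hypothetical wide-format ratings: one row per exam, one column per judge
wide = pd.DataFrame({'exam': [1, 2, 3],
                     'A': [1, 1, 3],
                     'B': [2, 3, 8],
                     'C': [0, 4, 1]})

#reshape to the long format that intraclass_corr() expects
long_df = wide.melt(id_vars='exam', var_name='judge', value_name='rating')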
Step 3: Calculate the Intraclass Correlation Coefficient
Next, we’ll use the following code to calculate the intraclass correlation coefficient:
import pingouin as pg

#calculate intraclass correlation coefficient
icc = pg.intraclass_corr(data=df, targets='exam', raters='judge', ratings='rating')
icc.set_index('Type')

                   Description       ICC         F  df1  df2      pval         CI95%
Type
ICC1    Single raters absolute  0.505252  5.084916    5   18  0.004430  [0.11, 0.89]
ICC2      Single random raters  0.503054  4.909385    5   15  0.007352   [0.1, 0.89]
ICC3       Single fixed raters  0.494272  4.909385    5   15  0.007352  [0.09, 0.88]
ICC1k  Average raters absolute  0.803340  5.084916    5   18  0.004430  [0.33, 0.97]
ICC2k    Average random raters  0.801947  4.909385    5   15  0.007352  [0.31, 0.97]
ICC3k     Average fixed raters  0.796309  4.909385    5   15  0.007352  [0.27, 0.97]
This function returns the following results:
- Description: The type of ICC calculated
- ICC: The intraclass correlation coefficient (ICC)
- F: The F-value used to test whether the ICC differs significantly from zero
- df1, df2: The degrees of freedom associated with the F-value
- pval: The p-value associated with the F-value
- CI95%: The 95% confidence interval for the ICC
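Since intraclass_corr() returns an ordinary pandas DataFrame, you can also read off a single estimate programmatically. For example, here is a quick sketch that pulls out the ICC2k value after indexing the results by Type:

#index the results by ICC type, then read off one estimate
results = icc.set_index('Type')
print(results.loc['ICC2k', 'ICC'])

#0.801947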
Notice that there are six different ICCs calculated here. This is because there are multiple ways to calculate the ICC depending on the following assumptions:
- Model: One-Way Random Effects, Two-Way Random Effects, or Two-Way Mixed Effects
- Type of Relationship: Consistency or Absolute Agreement
- Unit: Single rater or the mean of raters
For a detailed explanation of these assumptions, please refer to this article.
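For instance, if the judges in this example are viewed as a random sample from a larger population of possible judges and we care about absolute agreement, ICC2 (a single rater) or ICC2k (the mean of the four raters) would be the relevant rows, and we would report ICC2k ≈ 0.80 as the reliability of the judges' mean rating. As a rough reference, here is how the six rows map onto those assumptions, sketched from the standard Shrout and Fleiss conventions rather than taken from the pingouin output itself:

#rough mapping from each ICC type to its underlying assumptions
#(based on the standard Shrout & Fleiss / McGraw & Wong conventions)
icc_types = {
    'ICC1':  'one-way random effects, single rater, absolute agreement',
    'ICC2':  'two-way random effects, single rater, absolute agreement',
    'ICC3':  'two-way mixed effects, single rater, consistency',
    'ICC1k': 'one-way random effects, mean of k raters, absolute agreement',
    'ICC2k': 'two-way random effects, mean of k raters, absolute agreement',
    'ICC3k': 'two-way mixed effects, mean of k raters, consistency',
}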