How to Calculate Quantiles by Group in R (With Examples)

In statistics, quantiles are cut points that divide a ranked dataset into groups of equal size; quartiles, for example, split it into four.
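As a quick illustration of this definition, the sketch below calls base R's quantile() function on a single vector. The vector name wins_a is only illustrative (it holds the eight win totals for team A used later in this tutorial), and the result assumes R's default interpolation method (type = 7).

```r
#illustrative vector: the eight win totals for team A used later on
wins_a <- c(2, 4, 4, 5, 7, 9, 13, 13)

#calculate the quartiles using R's default method (type = 7)
quartiles <- quantile(wins_a, probs = c(.25, .5, .75))

print(quartiles)
```

This should print 4, 6 and 10 for the 25th, 50th and 75th percentiles, which matches the values for team A in the grouped example.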

To calculate quantiles grouped by a certain variable in R, we can combine the group_by() and summarize() functions from the dplyr package with the base R quantile() function:

library(dplyr)

#define quantiles of interest
q = c(.25, .5, .75)

#calculate quantiles by grouping variable
df %>%
  group_by(grouping_variable) %>%
  summarize(quant25 = quantile(numeric_variable, probs = q[1]), 
            quant50 = quantile(numeric_variable, probs = q[2]),
            quant75 = quantile(numeric_variable, probs = q[3]))

The following examples show how to use this syntax in practice.

Examples: Quantiles by Group in R

The following code shows how to calculate the quantiles for the number of wins grouped by team for a dataset in R:

library(dplyr)

#create data
df <- data.frame(team=c('A', 'A', 'A', 'A', 'A', 'A', 'A', 'A',
                        'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B',
                        'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C'),
                 wins=c(2, 4, 4, 5, 7, 9, 13, 13, 15, 15, 14, 13,
                        11, 9, 9, 8, 8, 16, 19, 21, 24, 20, 19, 18))

#view first six rows of data
head(df)

  team wins
1    A    2
2    A    4
3    A    4
4    A    5
5    A    7
6    A    9

#define quantiles of interest
q = c(.25, .5, .75)

#calculate quantiles by grouping variable
df %>%
  group_by(team) %>%
  summarize(quant25 = quantile(wins, probs = q[1]), 
            quant50 = quantile(wins, probs = q[2]),
            quant75 = quantile(wins, probs = q[3]))

  team  quant25  quant50  quant75           
1 A         4         6     10  
2 B         9        12     14.2
3 C        17.5      19     20.2

Note that we can also specify any number of quantiles that we’d like:

#define quantiles of interest
q = c(.2, .4, .6, .8)

#calculate quantiles by grouping variable
df %>%
  group_by(team) %>%
  summarize(quant20 = quantile(wins, probs = q[1]), 
            quant40 = quantile(wins, probs = q[2]),
            quant60 = quantile(wins, probs = q[3]),
            quant80 = quantile(wins, probs = q[4]))

  team  quant20 quant40 quant60 quant80
1 A         4       4.8     7.4    11.4
2 B         9      10.6    13.2    14.6
3 C        16.8    18.8    19.2    20.6
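When the list of quantiles is long, writing one summarize() argument per quantile gets repetitive. One possible alternative, sketched here on the assumption that dplyr 1.1.0 or later is installed (the version that introduced reframe()), returns one row per team and quantile instead of one column per quantile. The names long_quantiles, prob and value are illustrative choices, not part of the original example.

```r
library(dplyr)

#recreate the example data
df <- data.frame(team = rep(c('A', 'B', 'C'), each = 8),
                 wins = c(2, 4, 4, 5, 7, 9, 13, 13, 15, 15, 14, 13,
                          11, 9, 9, 8, 8, 16, 19, 21, 24, 20, 19, 18))

#define quantiles of interest
q <- c(.2, .4, .6, .8)

#calculate all quantiles at once, one row per team and quantile
#(reframe() assumes dplyr >= 1.1.0)
long_quantiles <- df %>%
  group_by(team) %>%
  reframe(prob = q,
          value = quantile(wins, probs = q, names = FALSE))

print(long_quantiles)
```

The result has 12 rows (3 teams x 4 quantiles), and changing the quantiles only requires editing q.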

We can also choose to calculate just one quantile by group. For example, here’s how to calculate the 90th percentile of the number of wins for each team:

#calculate 90th percentile of wins by team
df %>%
  group_by(team) %>%
  summarize(quant90 = quantile(wins, probs = 0.9))

  team  quant90
1 A        13
2 B        15
3 C        21.9
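To cross-check a single grouped quantile without dplyr, a base R sketch using tapply() is one option. This is an alternative to the approach above rather than part of it, and the name p90 is illustrative.

```r
#recreate the example data
df <- data.frame(team = rep(c('A', 'B', 'C'), each = 8),
                 wins = c(2, 4, 4, 5, 7, 9, 13, 13, 15, 15, 14, 13,
                          11, 9, 9, 8, 8, 16, 19, 21, 24, 20, 19, 18))

#apply quantile() to wins within each team; probs is passed on to quantile()
p90 <- tapply(df$wins, df$team, quantile, probs = 0.9)

print(p90)
```

The values returned for teams A, B and C should agree with the quant90 column above (13, 15 and 21.9).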

Additional Resources

How to Calculate Quartiles in R
How to Calculate Deciles in R
How to Calculate Percentiles in R
