Create New Variables in R with mutate() and case_when()

Often you may want to create a new variable in a data frame in R based on some condition. Fortunately, this is easy to do using the mutate() and case_when() functions from the dplyr package.

This tutorial shows several examples of how to use these functions with the following data frame:

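#create data frame
df <- data.frame(player=c('a', 'b', 'c', 'd', 'e'),
                 position=c('G', 'F', 'F', 'G', 'G'),
                 points=c(12, 15, 19, 22, 32),
                 rebounds=c(5, 7, 7, 12, 11))

#view data frame
df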

  player position points rebounds
1      a        G     12        5
2      b        F     15        7
3      c        F     19        7
4      d        G     22       12
5      e        G     32       11

Example 1: Create New Variable Based on One Existing Variable

The following code shows how to create a new variable called ‘scorer’ based on the value in the points column:

library(dplyr)

#define new variable 'scorer' using mutate() and case_when()
df %>%
  mutate(scorer = case_when(points < 15 ~ 'low',
                            points < 25 ~ 'med',
                            points >= 25 ~ 'high'))

  player position points rebounds scorer
1      a        G     12        5    low
2      b        F     15        7    med
3      c        F     19        7    med
4      d        G     22       12    med
5      e        G     32       11   high
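
case_when() checks its conditions from top to bottom and, for each row, uses the first one that is TRUE. The sketch below illustrates why the order of the conditions matters. The cutoffs of 15 and 25 and the vector name pts are assumptions chosen for illustration; they are consistent with the output above but are not the only cutoffs that would produce it.

```r
library(dplyr)

# illustrative points values (same values as the points column above)
pts <- c(12, 15, 19, 22, 32)

# conditions ordered from most to least restrictive: the first TRUE condition wins
ordered <- case_when(pts < 15 ~ 'low',
                     pts < 25 ~ 'med',
                     pts >= 25 ~ 'high')

# first two conditions swapped: 'low' can never be reached,
# because every value below 15 is also below 25
swapped <- case_when(pts < 25 ~ 'med',
                     pts < 15 ~ 'low',
                     pts >= 25 ~ 'high')

ordered
swapped
```

In the swapped version the first row is labeled 'med' rather than 'low', so it is usually safest to list the most specific condition first.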

Example 2: Create New Variable Based on Several Existing Variables

The following code shows how to create a new variable called ‘type’ based on the values in the player and position columns:

library(dplyr)

#define new variable 'type' using mutate() and case_when()
df %>%
  mutate(type = case_when(player == 'a' | player == 'b' ~ 'starter',
                            player == 'c' | player == 'd' ~ 'backup',
                            position == 'G' ~ 'reserve'))

  player position points rebounds    type
1      a        G     12        5 starter
2      b        F     15        7 starter
3      c        F     19        7  backup
4      d        G     22       12  backup
5      e        G     32       11 reserve
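
As a possible variation, chained == comparisons can be written more compactly with %in%, and a final TRUE ~ condition can act as a catch-all for rows that match nothing else. This is a minimal sketch: the data frame df2 is rebuilt from the printed output above, and the 'other' label is a hypothetical fallback that does not appear in the original example.

```r
library(dplyr)

# data frame rebuilt from the printed output (player and position only)
df2 <- data.frame(player = c('a', 'b', 'c', 'd', 'e'),
                  position = c('G', 'F', 'F', 'G', 'G'))

# %in% replaces the chained == comparisons;
# TRUE ~ 'other' is a hypothetical fallback for rows no condition matches
result <- df2 %>%
  mutate(type = case_when(player %in% c('a', 'b') ~ 'starter',
                          player %in% c('c', 'd') ~ 'backup',
                          position == 'G' ~ 'reserve',
                          TRUE ~ 'other'))

result
```

With this data every row matches one of the first three conditions, so the fallback is never used and the type column is the same as in the output above.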

The following code shows how to create a new variable called ‘valueAdded’ based on the values in the points and rebounds columns:

library(dplyr)

#define new variable 'valueAdded' using mutate() and case_when()
df %>%
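  mutate(valueAdded = case_when(points <= 15 & rebounds <= 5 ~ 2,
                                points <= 15 & rebounds > 5 ~ 4,
                                points < 25 & rebounds < 8 ~ 6,
                                points < 25 & rebounds > 8 ~ 7,
                                points >= 25 ~ 9))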

  player position points rebounds valueAdded
1      a        G     12        5          2
2      b        F     15        7          4
3      c        F     19        7          6
4      d        G     22       12          7
5      e        G     32       11          9
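
When a row matches none of the conditions, case_when() returns NA for that row. The sketch below shows this with two hypothetical rows that are not part of the data above. The conditions are assumptions chosen to be consistent with the printed output, and the fallback value of 0 is arbitrary.

```r
library(dplyr)

# hypothetical rows, not part of the tutorial's data
extra <- data.frame(points = c(20, 20), rebounds = c(7, 8))

# no condition covers points < 25 with rebounds exactly 8, so that row gets NA
gaps <- extra %>%
  mutate(valueAdded = case_when(points < 25 & rebounds < 8 ~ 6,
                                points < 25 & rebounds > 8 ~ 7,
                                points >= 25 ~ 9))

# a final TRUE ~ condition assigns a fallback value (0 here, chosen arbitrarily)
filled <- extra %>%
  mutate(valueAdded = case_when(points < 25 & rebounds < 8 ~ 6,
                                points < 25 & rebounds > 8 ~ 7,
                                points >= 25 ~ 9,
                                TRUE ~ 0))

gaps
filled
```

Checking the new column for NA values after running case_when() is a quick way to spot conditions that leave a gap.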

Additional Resources

How to Rename Columns in R
How to Remove Columns in R
How to Filter Rows in R
