dplyr: How to Change Factor Levels Using mutate()

You can use the following basic syntax in dplyr to change the levels of a factor variable by using the mutate() function:

library(dplyr)

df <- df %>% mutate(team=recode(team,
                                'H' = 'Hawks',
                                'M' = 'Mavs',
                                'C' = 'Cavs'))

This particular syntax makes the following changes to the team variable in the data frame:

  • ‘H’ becomes ‘Hawks’
  • ‘M’ becomes ‘Mavs’
  • ‘C’ becomes ‘Cavs’

The following example shows how to use this syntax in practice.

Example: Change Factor Levels Using mutate()

Suppose we have the following data frame in R that contains information about various basketball players:

#create data frame
df <- data.frame(team=factor(c('H', 'H', 'M', 'M', 'C', 'C')),
                 points=c(22, 35, 19, 15, 29, 23))

#view data frame
df

  team points
1    H     22
2    H     35
3    M     19
4    M     15
5    C     29
6    C     23
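
Before recoding, it can help to confirm that team is actually stored as a factor and to see its current levels. The following is a minimal base R sketch, assuming the same data frame as above (it is recreated here so the snippet runs on its own):

```r
#recreate the example data frame so this snippet is self-contained
df <- data.frame(team=factor(c('H', 'H', 'M', 'M', 'C', 'C')),
                 points=c(22, 35, 19, 15, 29, 23))

#check that team is stored as a factor
is.factor(df$team)

#display current factor levels of team variable
levels(df$team)
```

By default, factor() sorts the levels alphabetically, so the levels here are "C", "H", "M" rather than the order in which the values appear in the data.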

We can use the following syntax with the mutate() function from the dplyr package to change the levels of the team variable:

library(dplyr)

#change factor levels of team variable
df <- df %>% mutate(team=recode(team,
                                'H' = 'Hawks',
                                'M' = 'Mavs',
                                'C' = 'Cavs'))

#view updated data frame
df

   team points
1 Hawks     22
2 Hawks     35
3  Mavs     19
4  Mavs     15
5  Cavs     29
6  Cavs     23

Using this syntax, we were able to make the following changes to the team variable in the data frame:

  • ‘H’ becomes ‘Hawks’
  • ‘M’ becomes ‘Mavs’
  • ‘C’ becomes ‘Cavs’

We can verify that the factor levels have been changed by using the levels() function:

#display factor levels of team variable
levels(df$team)

[1] "Cavs"  "Hawks" "Mavs" 
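
As an additional check, the old and new labels can be compared side by side. The sketch below assumes the same example data and stores the result under a hypothetical name, df_new, so that the original data frame stays available for comparison:

```r
library(dplyr)

#recreate the example data frame so this snippet is self-contained
df <- data.frame(team=factor(c('H', 'H', 'M', 'M', 'C', 'C')),
                 points=c(22, 35, 19, 15, 29, 23))

#recode into a new data frame (df_new is a hypothetical name)
df_new <- df %>% mutate(team=recode(team,
                                    'H' = 'Hawks',
                                    'M' = 'Mavs',
                                    'C' = 'Cavs'))

#check that team is still a factor after recoding
is.factor(df_new$team)

#cross-tabulate old and new labels to confirm each mapping
table(old=df$team, new=df_new$team)
```

Each old level should line up with exactly one new level in the resulting table, and the points column is left untouched.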

Also note that you can choose to change just one factor level instead of all of them.

For example, starting again from the original data frame, we can use the following syntax to only change ‘H’ to ‘Hawks’ and leave the other factor levels unchanged:

library(dplyr)

#change one factor level of team variable
df <- df %>% mutate(team=recode(team, 'H' = 'Hawks'))

#view updated data frame
df

   team points
1 Hawks     22
2 Hawks     35
3     M     19
4     M     15
5     C     29
6     C     23

Notice that ‘H’ has been changed to ‘Hawks’ but the other two factor levels remained unchanged.
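
The same levels() check can be applied to the partial recode. This is a minimal sketch, assuming the original example data and a hypothetical result name, df_one:

```r
library(dplyr)

#recreate the original data frame so this snippet is self-contained
df <- data.frame(team=factor(c('H', 'H', 'M', 'M', 'C', 'C')),
                 points=c(22, 35, 19, 15, 29, 23))

#change only one factor level (df_one is a hypothetical name)
df_one <- df %>% mutate(team=recode(team, 'H' = 'Hawks'))

#display factor levels of team variable
levels(df_one$team)
```

The levels should now be "C", "Hawks", "M": only the matched level is renamed, and the position of each level is preserved.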

Additional Resources

The following tutorials explain how to perform other common tasks in dplyr:

How to Remove Rows Using dplyr
How to Select Columns by Index Using dplyr
How to Filter Rows that Contain a Certain String Using dplyr
