How to Create Categorical Variables in R (With Examples)

You can use the following syntax to create a categorical variable in R:

#create categorical variable from scratch
cat_variable <- factor(c('A', 'B', 'C', 'D'))

#create categorical variable (with two possible values) from existing variable
cat_variable <- factor(ifelse(existing_variable < 4, 1, 0))

#create categorical variable (with multiple possible values) from existing variable
cat_variable <- factor(ifelse(existing_variable < 3, 'A',
                       ifelse(existing_variable < 4, 'B',
                       ifelse(existing_variable < 5, 'C',
                       ifelse(existing_variable < 6, 'D', 'E')))))

The following examples show how to use this syntax in practice.

Example 1: Create a Categorical Variable from Scratch

The following code shows how to create a categorical variable from scratch:

#create data frame
df <- data.frame(var1=c(1, 3, 3, 4, 5),
                 var2=c(7, 7, 8, 3, 2),
                 var3=c(3, 3, 6, 10, 12),
                 var4=c(14, 16, 22, 19, 18))

#view data frame
df

  var1 var2 var3 var4
1    1    7    3   14
2    3    7    3   16
3    3    8    6   22
4    4    3   10   19
5    5    2   12   18

#add categorical variable named 'type' to data frame
df$type <- factor(c('A', 'B', 'B', 'C', 'D'))

#view updated data frame
df

  var1 var2 var3 var4 type
1    1    7    3   14    A
2    3    7    3   16    B
3    3    8    6   22    B
4    4    3   10   19    C
5    5    2   12   18    D
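
To confirm that the new column is stored as a factor rather than a plain character vector, one option is to inspect it with a few base R functions. The following is a minimal sketch that assumes the same data frame and 'type' column as in this example:

```r
#recreate the data frame and the 'type' column from this example
df <- data.frame(var1=c(1, 3, 3, 4, 5),
                 var2=c(7, 7, 8, 3, 2),
                 var3=c(3, 3, 6, 10, 12),
                 var4=c(14, 16, 22, 19, 18))
df$type <- factor(c('A', 'B', 'B', 'C', 'D'))

#check the class of the new column (returns "factor")
class(df$type)

#list the distinct categories, known as levels
levels(df$type)

#count the number of rows in each category
table(df$type)
```

Here levels() returns the four categories 'A', 'B', 'C' and 'D', and table() shows that 'B' occurs twice while every other category occurs once.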

Example 2: Create a Categorical Variable (with Two Values) from Existing Variable

The following code shows how to create a categorical variable from an existing variable in a data frame:

#create data frame
df <- data.frame(var1=c(1, 3, 3, 4, 5),
                 var2=c(7, 7, 8, 3, 2),
                 var3=c(3, 3, 6, 10, 12),
                 var4=c(14, 16, 22, 19, 18))

#view data frame
df

  var1 var2 var3 var4
1    1    7    3   14
2    3    7    3   16
3    3    8    6   22
4    4    3   10   19
5    5    2   12   18

#add categorical variable named 'type' using values from 'var1' column
df$type <- factor(ifelse(df$var1 < 4, 1, 0))

#view updated data frame
df

  var1 var2 var3 var4 type
1    1    7    3   14    1
2    3    7    3   16    1
3    3    8    6   22    1
4    4    3   10   19    0
5    5    2   12   18    0

Using the ifelse() function, we created a new categorical variable called “type” that takes the following values:

  • 1 if the value in the ‘var1’ column is less than 4.
  • 0 if the value in the ‘var1’ column is not less than 4.
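
The values 1 and 0 carry no meaning on their own. One possible refinement, which is not part of the example above, is to pass the levels and labels arguments to factor() so that each category gets a readable name. In the sketch below, the label names 'low' and 'high' are hypothetical and chosen only for illustration:

```r
#create a small data frame with the same 'var1' values as above
df <- data.frame(var1=c(1, 3, 3, 4, 5))

#flag rows where var1 is less than 4, then attach readable labels
#level 0 becomes 'high' and level 1 becomes 'low' (hypothetical names)
df$type <- factor(ifelse(df$var1 < 4, 1, 0),
                  levels=c(0, 1),
                  labels=c('high', 'low'))

#view updated data frame
df
```

With these labels, the first three rows are marked 'low' and the last two rows are marked 'high'.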

Example 3: Create a Categorical Variable (with Multiple Values) from Existing Variable

The following code shows how to create a categorical variable (with multiple values) from an existing variable in a data frame:

#create data frame
df <- data.frame(var1=c(1, 3, 3, 4, 5),
                 var2=c(7, 7, 8, 3, 2),
                 var3=c(3, 3, 6, 10, 12),
                 var4=c(14, 16, 22, 19, 18))

#view data frame
df

  var1 var2 var3 var4
1    1    7    3   14
2    3    7    3   16
3    3    8    6   22
4    4    3   10   19
5    5    2   12   18

#add categorical variable named 'type' using values from 'var1' column
df$type <- factor(ifelse(df$var1 < 3, 'A',
                  ifelse(df$var1 < 4, 'B',
                  ifelse(df$var1 < 5, 'C',
                  ifelse(df$var1 < 6, 'D', 'E')))))

#view updated data frame
df

  var1 var2 var3 var4 type
1    1    7    3   14    A
2    3    7    3   16    B
3    3    8    6   22    B
4    4    3   10   19    C
5    5    2   12   18    D

Using the ifelse() function, we created a new categorical variable called “type” that takes the following values:

  • ‘A’ if the value in the ‘var1’ column is less than 3.
  • Else, ‘B’ if the value in the ‘var1’ column is less than 4.
  • Else, ‘C’ if the value in the ‘var1’ column is less than 5.
  • Else, ‘D’ if the value in the ‘var1’ column is less than 6.
  • Else, ‘E’.
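
Nested ifelse() calls become hard to read as the number of categories grows. As an alternative to the approach above (a different technique, not the one used in this example), the base R cut() function can assign the same categories from a vector of break points. The sketch below assumes the same thresholds of 3, 4, 5 and 6:

```r
#create a small data frame with the same 'var1' values as above
df <- data.frame(var1=c(1, 3, 3, 4, 5))

#breaks define the intervals [-Inf, 3), [3, 4), [4, 5), [5, 6), [6, Inf)
#right=FALSE closes each interval on the left, matching the '<' tests
df$type <- cut(df$var1,
               breaks=c(-Inf, 3, 4, 5, 6, Inf),
               labels=c('A', 'B', 'C', 'D', 'E'),
               right=FALSE)

#view updated data frame
df
```

This assigns 'A', 'B', 'B', 'C' and 'D' to the five rows, the same as the nested ifelse() version. One difference is that cut() keeps 'E' as a level of the factor even though no row falls into that category.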

Additional Resources

How to Create Dummy Variables in R
How to Convert Factor to Character in R
How to Convert Character to Numeric in R
