How to Perform Label Encoding in R (With Examples)

Often in machine learning, we want to convert categorical variables into some type of numeric format that can be readily used by algorithms.

One way to do this is through label encoding, which assigns each categorical value an integer value based on alphabetical order.

For example, label encoding converts each unique value in a categorical variable called Team into an integer value based on alphabetical order.
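As a minimal sketch of this idea, assume a hypothetical vector of team names (the names below are made up for illustration and are not part of the examples that follow). It uses only base R and shows that the integer codes follow the alphabetical order of the labels, not the order in which they appear:

```r
#hypothetical team names, listed in non-alphabetical order
teams <- c("Mavs", "Celtics", "Heat", "Celtics")

#factor() sorts the unique values, so the levels come out alphabetically
team_factor <- factor(teams)
levels(team_factor)

#each value is replaced by the position of its level
team_codes <- as.integer(team_factor)
team_codes
```

Here "Celtics" is encoded as 1, "Heat" as 2 and "Mavs" as 3, even though "Mavs" appears first in the data.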

There are two common ways to perform label encoding in R:

Method 1: Use Base R

df$my_var <- as.numeric(factor(df$my_var))

Method 2: Use CatEncoders Package

library(CatEncoders)

#define original categorical labels
labs = LabelEncoder.fit(df$my_var)

#convert labels to numeric values
df$my_var = transform(labs, df$my_var)

The following examples show how to use each method in practice.

Example 1: Label Encoding Using Base R

The following code shows how to use the factor() and as.numeric() functions from base R to convert a categorical variable called team into a numeric variable:

#create data frame
df <- data.frame(team=c('A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'),
                 points=c(25, 12, 15, 14, 19, 23, 25, 29))

#view data frame
df

  team points
1    A     25
2    A     12
3    B     15
4    B     14
5    B     19
6    B     23
7    C     25
8    C     29

#perform label encoding on team variable
df$team <- as.numeric(factor(df$team))

#view updated data frame
df

  team points
1    1     25
2    1     12
3    2     15
4    2     14
5    2     19
6    2     23
7    3     25
8    3     29

Notice the new values in the team column:

“A” has become 1.
“B” has become 2.
“C” has become 3.

We have successfully converted the team column from a categorical variable into a numeric variable.
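One limitation of overwriting the column this way is that the original labels are no longer stored anywhere. A possible workaround, sketched below under the assumption that you want to keep the mapping, is to save the factor before converting it. The names team_factor, team_lookup and original_team are illustrative only:

```r
#create data frame
df <- data.frame(team=c('A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'),
                 points=c(25, 12, 15, 14, 19, 23, 25, 29))

#keep the factor so its levels remain available
team_factor <- factor(df$team)

#build a lookup table of label and code
team_lookup <- data.frame(label=levels(team_factor),
                          code=seq_along(levels(team_factor)))

#perform label encoding on team variable
df$team <- as.numeric(team_factor)

#recover the original labels from the numeric codes
original_team <- levels(team_factor)[df$team]
original_team
```

Indexing the stored levels by the numeric codes returns the original labels, which plays the same role for base R that an inverse transform plays for an encoder object.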

Example 2: Label Encoding Using CatEncoders Package

The following code shows how to use functions from the CatEncoders package to convert a categorical variable called team into a numeric variable:

library(CatEncoders)

#create data frame
df <- data.frame(team=c('A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'),
                 points=c(25, 12, 15, 14, 19, 23, 25, 29))

#define original categorical labels
labs = LabelEncoder.fit(df$team)

#convert labels to numeric values
df$team = transform(labs, df$team)

#view updated data frame
df

  team points
1    1     25
2    1     12
3    2     15
4    2     14
5    2     19
6    2     23
7    3     25
8    3     29

Once again, we have generated the following new values in the team column:

“A” has become 1.
“B” has become 2.
“C” has become 3.

This matches the results from the previous example.

Note that using this method, you can also use inverse.transform() to obtain the original values from the team column:

#display original team labels
inverse.transform(labs, df$team)

[1] "A" "A" "B" "B" "B" "B" "C" "C"
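A stored mapping is also useful when the same encoding has to be applied to new data. The sketch below illustrates this with base R only (it does not describe how CatEncoders handles this case) and assumes a hypothetical vector new_team that contains a label not present in the original data:

```r
#labels that were present when the encoding was defined
train_team <- c('A', 'A', 'B', 'B', 'B', 'B', 'C', 'C')
team_levels <- levels(factor(train_team))

#hypothetical new data, including a label ('D') that was not seen before
new_team <- c('C', 'A', 'D')

#reuse the stored levels so the codes stay consistent with the original data
new_codes <- as.numeric(factor(new_team, levels=team_levels))
new_codes
```

With this approach "C" is still encoded as 3 and "A" as 1, while the unseen label "D" becomes NA instead of silently receiving a new code.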

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Create Categorical Variable from Continuous in R
How to Create Categorical Variables in R
How to Convert Categorical Variables to Numeric in R
