Sunday, July 7, 2024

R: How to Use apply() Function on Specific Columns

Often you may want to use the apply() function to apply a function to specific columns in a data frame in R.

However, apply() first converts the data frame to a matrix (via as.matrix()), which forces all columns to share a single type. If any of the columns passed in is non-numeric, every value becomes a character string, and the function may fail or return unexpected results.
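The coercion described above can be seen in a minimal sketch. The data frame demo and its values are hypothetical, made up only for illustration:

```r
# Hypothetical mixed-type data frame (illustration only)
demo <- data.frame(team = c('A', 'B'),
                   points = c(19, 14),
                   rebounds = c(10, 11))

# apply() converts its input with as.matrix(); because team is not numeric,
# the whole matrix (including points and rebounds) becomes character
m <- as.matrix(demo)
typeof(m)   # "character"

# So every column reaches the function as a character vector
via_apply <- apply(demo, 2, is.numeric)

# lapply() hands each column to the function as its own vector,
# so the numeric columns are still numeric
via_lapply <- unlist(lapply(demo, is.numeric))

via_apply
via_lapply
```

Here via_apply is FALSE for all three columns, while via_lapply is TRUE for points and rebounds. If only numeric columns are passed to apply(), the coercion is harmless, but the result is still a matrix rather than a data frame.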

A better choice is the lapply() function, which uses the following basic syntax:

df[c('col1', 'col2')] <- lapply(df[c('col1', 'col2')], my_function)

This particular example applies the function my_function to only col1 and col2 in the data frame.
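To see why this pattern touches only the selected columns, it may help to look at what lapply() returns before it is assigned back. The names toy and times_ten below are hypothetical and used only for this sketch:

```r
# Hypothetical data frame and function (illustration only)
toy <- data.frame(col1 = c(1, 2, 3),
                  col2 = c(10, 20, 30),
                  col3 = c('x', 'y', 'z'))
times_ten <- function(x) x * 10

# lapply() returns a list with one transformed vector per selected column
result <- lapply(toy[c('col1', 'col2')], times_ten)
class(result)   # "list"
names(result)   # "col1" "col2"

# Assigning that list back replaces only col1 and col2; col3 is untouched
toy[c('col1', 'col2')] <- result
toy
```

Because the left-hand side selects the same columns as the right-hand side, each list element lands in the column it came from.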

The following example shows how to use this syntax in practice.

Example: Apply Function to Specific Columns of Data Frame

Suppose we have the following data frame in R:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(19, 22, 15, NA, 14, 25, 25, 25),
                 rebounds=c(10, 6, 3, 7, 11, 13, 9, 12),
                 assists=c(4, 4, 3, 6, 7, 5, 10, 8))

#view data frame
df

  team points rebounds assists
1    A     19       10       4
2    A     22        6       4
3    A     15        3       3
4    A     NA        7       6
5    B     14       11       7
6    B     25       13       5
7    B     25        9      10
8    B     25       12       8

Now suppose we define the following function that multiplies values by 2 and then adds 1:

#define function
my_function <- function(x) x*2 + 1

We can use the following lapply() function to apply this function only to the points and rebounds columns in the data frame:

#apply function to specific columns
df[c('points', 'rebounds')] <- lapply(df[c('points', 'rebounds')], my_function)

#view updated data frame
df

  team points rebounds assists
1    A     39       21       4
2    A     45       13       4
3    A     31        7       3
4    A     NA       15       6
5    B     29       23       7
6    B     51       27       5
7    B     51       19      10
8    B     51       25       8

From the output we can see that we multiplied each value in the points and rebounds columns by 2 and then added 1.

Also notice that the team and assists columns remained unchanged.
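Rather than comparing the printed output by eye, the same observations can be checked in code. The following is a sketch only; it recreates the example data so that it runs on its own, and the helper names original and cols are introduced here for illustration:

```r
# Recreate the example data and function so this block is self-contained
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 points=c(19, 22, 15, NA, 14, 25, 25, 25),
                 rebounds=c(10, 6, 3, 7, 11, 13, 9, 12),
                 assists=c(4, 4, 3, 6, 7, 5, 10, 8))
my_function <- function(x) x*2 + 1

original <- df                     # keep a copy for comparison
cols <- c('points', 'rebounds')    # columns to transform

df[cols] <- lapply(df[cols], my_function)

# The untouched columns are identical to the originals
identical(df$team, original$team)         # TRUE
identical(df$assists, original$assists)   # TRUE

# The transformed columns are still numeric
sapply(df[cols], is.numeric)

# The missing value in points is still missing (NA*2 + 1 is NA)
is.na(df$points[4])                       # TRUE
```

Storing the column names in a vector such as cols also avoids typing the selection twice, which reduces the chance that the two sides of the assignment drift apart.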

Additional Resources

The following tutorials explain how to perform other common tasks in R:

A Guide to apply(), lapply(), sapply(), and tapply() in R
How to Use the transform Function in R
