Sunday, June 8, 2025

Comparing grep() vs. grepl() in R: What’s the Difference?

Two functions that people often get mixed up in R are grep() and grepl(). Both functions allow you to see whether a certain pattern exists in a character string, but they return different results:

  • grepl() returns a logical vector with one TRUE or FALSE per character string, where TRUE means the string contains the pattern.
  • grep() returns a vector of the indices of the character strings that contain the pattern.

The following example illustrates this difference:

#create a vector of data
data <- c('P Guard', 'S Guard', 'S Forward', 'P Forward', 'Center')

grep('Guard', data)
[1] 1 2

grepl('Guard', data) 
[1]  TRUE  TRUE FALSE FALSE FALSE
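
The relationship between the two return values can be checked directly. The following is a minimal sketch that assumes a five-element vector consistent with the output above; which() and the value argument of grep() are base R features that the example above does not use.

```r
# five-element vector consistent with the output shown above (assumed)
data <- c('P Guard', 'S Guard', 'S Forward', 'P Forward', 'Center')

idx <- grep('Guard', data)       # integer positions of the matching elements
flags <- grepl('Guard', data)    # one logical value per element

# the positions of the TRUE flags are the same indices that grep() returns
same <- identical(which(flags), idx)

# value = TRUE makes grep() return the matching strings instead of positions
matches <- grep('Guard', data, value = TRUE)
```

In other words, grep() answers "where are the matches?" while grepl() answers "is each element a match?", and either result can be derived from the other.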

The following examples show when you might want to use one of these functions over the other.

When to Use grepl()

1. Filter Rows that Contain a Certain String

One of the most common uses of grepl() is for filtering rows in a data frame that contain a certain string:

library(dplyr)

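#create data frame (rebounds for the last three rows are not shown in the output below, so they are left as NA)
df <- data.frame(player=c('P Guard', 'S Guard', 'S Forward', 'P Forward', 'Center'),
                 points=c(12, 15, 19, 22, 32),
                 rebounds=c(5, 7, NA, NA, NA))

#filter rows that contain the string 'Guard' in the player column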
df %>% filter(grepl('Guard', player))

   player points rebounds
1 P Guard     12        5
2 S Guard     15        7
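
The same filter can be written without dplyr, which may help show why grepl() suits row filtering: it returns one logical value per row, so it can be used directly as a row index. This is a sketch in base R; the data frame is an assumed one consistent with the output above, and the pattern 'Coach' is a hypothetical non-matching string used only for illustration.

```r
# assumed data frame consistent with the output shown above
df <- data.frame(player = c('P Guard', 'S Guard', 'S Forward', 'P Forward', 'Center'),
                 points = c(12, 15, 19, 22, 32))

# grepl() yields one TRUE/FALSE per row, so it can index rows directly
keep <- grepl('Guard', df$player)
guards <- df[keep, ]

# negating the logical vector keeps the rows that do NOT match
non_guards <- df[!keep, ]

# ignore.case = TRUE matches regardless of letter case
guards_ci <- df[grepl('guard', df$player, ignore.case = TRUE), ]

# when nothing matches, !grepl() keeps every row ...
all_rows <- df[!grepl('Coach', df$player), ]
# ... whereas a negative grep() index is empty and selects no rows at all
no_rows <- df[-grep('Coach', df$player), ]
```

The last two lines illustrate one reason to prefer the logical form when excluding rows: a negated grepl() behaves sensibly even when the pattern matches nothing.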

Related: How to Filter Rows that Contain a Certain String Using dplyr

When to Use grep()

1. Select Columns that Contain a Certain String

You can use grep() to select the columns in a data frame whose names contain a certain string:

library(dplyr)

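#create data frame (rebounds values that cannot be recovered from the outputs in this article are left as NA)
df <- data.frame(player=c('P Guard', 'S Guard', 'S Forward', 'P Forward', 'Center'),
                 points=c(12, 15, 19, 22, 32),
                 rebounds=c(5, 7, NA, NA, NA))

#select columns that contain the string 'p' in their name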
df %>% select(grep('p', colnames(df)))

     player points
1   P Guard     12
2   S Guard     15
3 S Forward     19
4 P Forward     22
5    Center     32
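
Because grep() returns integer positions, its result can also be used as a column index in base R. The sketch below assumes a data frame consistent with the output above; the NA entries in rebounds stand in for values that are not shown in this article.

```r
# assumed data frame; NA marks rebounds values not shown in the article
df <- data.frame(player = c('P Guard', 'S Guard', 'S Forward', 'P Forward', 'Center'),
                 points = c(12, 15, 19, 22, 32),
                 rebounds = c(5, 7, NA, NA, NA))

# positions of the column names that contain 'p'
col_idx <- grep('p', colnames(df))

# integer positions select columns; drop = FALSE keeps a data frame
# even if only one column matches
subset_df <- df[, col_idx, drop = FALSE]

# value = TRUE returns the matching column names themselves
col_names <- grep('p', colnames(df), value = TRUE)
```

Here col_idx holds the positions of player and points, while rebounds is excluded because its name contains no 'p'.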

2. Count the Number of Rows that Contain a Certain String

You can use grep() to count the number of rows in a data frame that contain a certain string:

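#create data frame (rebounds values that cannot be recovered from the outputs in this article are left as NA)
df <- data.frame(player=c('P Guard', 'S Guard', 'S Forward', 'P Forward', 'Center'),
                 points=c(12, 15, 19, 22, 32),
                 rebounds=c(5, 7, NA, NA, NA))

#count how many rows contain the string 'Guard' in the player column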
length(grep('Guard', df$player))

[1] 2
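
The same count can be obtained from grepl(), since TRUE is treated as 1 when summed. This is a small sketch for comparison, using an assumed vector of player names consistent with the outputs above.

```r
# assumed player names consistent with the outputs shown above
players <- c('P Guard', 'S Guard', 'S Forward', 'P Forward', 'Center')

# count matches by measuring the length of the index vector
n_grep <- length(grep('Guard', players))

# count matches by summing the logical vector (TRUE counts as 1)
n_grepl <- sum(grepl('Guard', players))

# the mean of the logical vector gives the share of matching rows
share <- mean(grepl('Guard', players))
```

Both counts agree; the grepl() form has the small advantage that mean() gives the proportion of matching rows with no extra work.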
