How to Remove Rows Using dplyr (With Examples)

You can use the following basic syntax to remove rows from a data frame in R using dplyr:

1. Remove any row with NA’s

df %>%
  na.omit()

2. Remove any row with NA’s in a specific column

df %>%
  filter(!is.na(column_name))

3. Remove duplicates

df %>%
  distinct()

4. Remove rows by index position

df %>%
  filter(!row_number() %in% c(1, 2, 4))

5. Remove rows based on condition (filter() keeps only the rows that meet it)

df %>%
  filter(column1=='A' | column2 > 8)

The following examples show how to use each of these methods in practice with the following data frame:

library(dplyr)

#create data frame
df <- data.frame(team=c('A', 'A', 'B', 'B', 'C', 'C'),
                 points=c(4, NA, 7, 5, 9, 9),
                 assists=c(1, 3, 5, NA, 2, 2))

#view data frame
df

  team points assists
1    A      4       1
2    A     NA       3
3    B      7       5
4    B      5      NA
5    C      9       2
6    C      9       2

Example 1: Remove Any Row with NA’s

The following code shows how to remove any row with NA values from the data frame:

#remove any row with NA
df %>%
  na.omit()

  team points assists
1    A      4       1
3    B      7       5
5    C      9       2
6    C      9       2
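Notice that the output above keeps the original row labels (1, 3, 5, 6). If consecutive labels are preferred, one possible approach is to reset them with base R after calling na.omit(). This is only a sketch; the name cleaned is a hypothetical variable introduced for illustration:

```r
library(dplyr)

# same data frame as in the examples
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C', 'C'),
                 points = c(4, NA, 7, 5, 9, 9),
                 assists = c(1, 3, 5, NA, 2, 2))

# drop every row that contains at least one NA
cleaned <- df %>%
  na.omit()

rownames(cleaned)          # "1" "3" "5" "6" (original labels are kept)

# reset the row labels so they run from 1 to the number of rows
rownames(cleaned) <- NULL

rownames(cleaned)          # "1" "2" "3" "4"
```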

Example 2: Remove Any Row with NA’s in a Specific Column

The following code shows how to remove any row with NA values in a specific column:

#remove any row with NA in 'points' column:
df %>%
  filter(!is.na(points))

  team points assists
1    A      4       1
2    B      7       5
3    B      5      NA
4    C      9       2
5    C      9       2
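The same idea can be extended to several columns at once. The sketch below assumes dplyr 1.0.4 or later, where the if_all() helper is available, and uses a hypothetical variable name (no_na_stats) for the result:

```r
library(dplyr)

# same data frame as in the examples
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C', 'C'),
                 points = c(4, NA, 7, 5, 9, 9),
                 assists = c(1, 3, 5, NA, 2, 2))

# keep only rows where BOTH 'points' and 'assists' are non-missing
no_na_stats <- df %>%
  filter(if_all(c(points, assists), ~ !is.na(.x)))

no_na_stats
```

With this data frame, rows 2 and 4 are dropped because each has an NA in one of the two listed columns.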

Example 3: Remove Duplicate Rows

The following code shows how to remove duplicate rows:

#remove duplicate rows
df %>%
  distinct()

  team points assists
1    A      4       1
2    A     NA       3
3    B      7       5
4    B      5      NA
5    C      9       2
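distinct() can also judge duplicates on selected columns only. As a sketch, the code below keeps the first row for each team and uses .keep_all = TRUE so that the other columns are retained; one_per_team is a hypothetical variable name:

```r
library(dplyr)

# same data frame as in the examples
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C', 'C'),
                 points = c(4, NA, 7, 5, 9, 9),
                 assists = c(1, 3, 5, NA, 2, 2))

# treat rows as duplicates when 'team' matches; keep the first row per team
one_per_team <- df %>%
  distinct(team, .keep_all = TRUE)

one_per_team
```

Without .keep_all = TRUE, the result would contain only the team column.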

Example 4: Remove Rows by Index Position

The following code shows how to remove rows based on index position:

#remove rows 1, 2, and 4
df %>%
  filter(!row_number() %in% c(1, 2, 4))

  team points assists
1    B      7       5
2    C      9       2
3    C      9       2
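As an alternative, dplyr's slice() accepts negative positions, which drops the listed rows directly. The following sketch should give the same rows as the filter() call above; by_slice is a hypothetical variable name:

```r
library(dplyr)

# same data frame as in the examples
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C', 'C'),
                 points = c(4, NA, 7, 5, 9, 9),
                 assists = c(1, 3, 5, NA, 2, 2))

# negative positions tell slice() which rows to drop
by_slice <- df %>%
  slice(-c(1, 2, 4))

by_slice
```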

Example 5: Remove Rows Based on Condition

The following code shows how to remove rows based on specific conditions. Note that filter() keeps the rows that meet the condition and drops the rest:

#only keep rows where team is equal to 'A' or points is greater than 8
df %>%
  filter(team=='A' | points > 8)

  team points assists
1    A      4       1
2    A     NA       3
3    C      9       2
4    C      9       2
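To drop the rows that match a condition instead of keeping them, the condition can be negated with !. Keep in mind that filter() also drops rows where the condition evaluates to NA, so missing values may need to be handled explicitly. The sketch below illustrates both points; removed and keep_na are hypothetical variable names:

```r
library(dplyr)

# same data frame as in the examples
df <- data.frame(team = c('A', 'A', 'B', 'B', 'C', 'C'),
                 points = c(4, NA, 7, 5, 9, 9),
                 assists = c(1, 3, 5, NA, 2, 2))

# drop rows where team is 'A' or points is greater than 8
removed <- df %>%
  filter(!(team == 'A' | points > 8))

removed    # only the two team 'B' rows remain

# drop rows where points is greater than 8, but keep rows where points is NA
keep_na <- df %>%
  filter(is.na(points) | points <= 8)

keep_na    # four rows: points 4, NA, 7, 5
```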

Additional Resources

The following tutorials explain how to perform other common functions in dplyr:

How to Select Columns by Index Using dplyr
How to Rank Variables by Group Using dplyr
How to Replace NA with Zero in dplyr
