How to Select Unique Rows in a Data Frame in R

You can use the following methods to select unique rows from a data frame in R:

Method 1: Select Unique Rows Across All Columns

library(dplyr)

df %>% distinct()

Method 2: Select Unique Rows Based on One Column

library(dplyr)

df %>% distinct(column1, .keep_all=TRUE)

Method 3: Select Unique Rows Based on Multiple Columns

library(dplyr)

df %>% distinct(column1, column2, .keep_all=TRUE)

This tutorial explains how to use each method in practice with the following data frame:

#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 position=c('G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'),
                 points=c(10, 10, 8, 14, 15, 15, 17, 17))

#view data frame
df

  team position points
1    A        G     10
2    A        G     10
3    A        F      8
4    A        F     14
5    B        G     15
6    B        G     15
7    B        F     17
8    B        F     17

Example 1: Select Unique Rows Across All Columns

The following code shows how to select rows that have unique values across all columns in the data frame:

library(dplyr)

#select rows with unique values across all columns
df %>% distinct()

  team position points
1    A        G     10
2    A        F      8
3    A        F     14
4    B        G     15
5    B        F     17

We can see that there are five unique rows in the data frame.

Note: When duplicate rows are encountered, only the first occurrence of each row is kept.
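To check which rows are treated as duplicates, base R offers duplicated() and unique(). The following is a minimal sketch that rebuilds the same example data frame; the names dup_flags, dropped_rows and unique_rows are illustrative only and do not appear elsewhere in this tutorial.

```r
#create the same example data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 position=c('G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'),
                 points=c(10, 10, 8, 14, 15, 15, 17, 17))

#flag each row that repeats an earlier row (TRUE = duplicate)
dup_flags <- duplicated(df)

#original row numbers of the duplicates
dropped_rows <- which(dup_flags)

#base R counterpart of distinct(): keeps the first occurrence of each row
unique_rows <- unique(df)
```

Unlike distinct(), unique() keeps the original row names, so the rows in unique_rows are labeled by their positions in df rather than renumbered from 1.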

Example 2: Select Unique Rows Based on One Column

The following code shows how to select unique rows based on the team column only:

library(dplyr)

#select rows with unique values based on team column only
df %>% distinct(team, .keep_all=TRUE)

  team position points
1    A        G     10
2    B        G     15

Since the team column contains only two unique values, only the first row in which each value occurs is kept.

Note: The argument .keep_all=TRUE tells R to keep all other columns in the output. Without it, distinct() returns only the columns named in the call.
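To see the effect of this argument, the same call can be run without it. This is a minimal sketch that rebuilds the example data frame; the name team_only is illustrative.

```r
library(dplyr)

#create the same example data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 position=c('G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'),
                 points=c(10, 10, 8, 14, 15, 15, 17, 17))

#omit .keep_all=TRUE: only the team column is returned
team_only <- df %>% distinct(team)

#column names of the result
names(team_only)
```

The result still has one row per unique team, but the position and points columns are dropped.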

Example 3: Select Unique Rows Based on Multiple Columns

The following code shows how to select unique rows based on the team and position columns only:

library(dplyr)

#select rows with unique values based on team and position columns only
df %>% distinct(team, position, .keep_all=TRUE)

  team position points
1    A        G     10
2    A        F      8
3    B        G     15
4    B        F     17

Four rows are returned, since there are four unique combinations of values across the team and position columns.
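Because distinct() keeps the first occurrence of each combination, the row order determines which row survives. As a hedged sketch (not part of the examples above), sorting with arrange() before calling distinct() keeps the highest-scoring row for each team and position pair; the name top_rows is illustrative.

```r
library(dplyr)

#create the same example data frame
df <- data.frame(team=c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'),
                 position=c('G', 'G', 'F', 'F', 'G', 'G', 'F', 'F'),
                 points=c(10, 10, 8, 14, 15, 15, 17, 17))

#sort by points (highest first), then keep the first row of each
#team and position combination
top_rows <- df %>%
  arrange(desc(points)) %>%
  distinct(team, position, .keep_all=TRUE)

top_rows
```

With this ordering, the row kept for team A at position F has 14 points rather than 8.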

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Filter for Unique Values Using dplyr
How to Filter by Multiple Conditions Using dplyr
How to Count Number of Occurrences in Columns in R
