You can use the following basic syntax to group by and filter data using the dplyr package in R:
df %>% group_by(team) %>% filter(any(points == 10))
This particular syntax groups a data frame by the column called team and filters for only the groups where at least one value in the points column is equal to 10.
The following example shows how to use this syntax in practice.
Example: Group By and Filter Data Using dplyr
Suppose we have the following data frame in R that contains information about various basketball players:
#create data frame
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'),
                 points=c(10, 15, 8, 4, 10, 10, 12, 12, 7))
#view data frame
df
  team points
1    A     10
2    A     15
3    A      8
4    B      4
5    B     10
6    B     10
7    C     12
8    C     12
9    C      7
We can use the following code to group the data frame by the team column and keep only the groups that have at least one value in the points column equal to 10:
library(dplyr)
#group by team and filter out teams where no points value is equal to 10
df %>%
  group_by(team) %>%
  filter(any(points == 10))
# A tibble: 6 x 2
# Groups:   team [2]
  team  points
1 A         10
2 A         15
3 A          8
4 B          4
5 B         10
6 B         10
Notice that all rows where the team is equal to “C” are filtered out because no value in the points column for team “C” is equal to 10.
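To see why team “C” is dropped, it can help to compute the group-level condition on its own. The following is a minimal sketch that uses summarise() to show the single TRUE or FALSE value that any(points == 10) produces for each team; the names group_check and has_10 are made up for this illustration.

```r
library(dplyr)

#recreate the data frame from the example above
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'),
                 points=c(10, 15, 8, 4, 10, 10, 12, 12, 7))

#compute the condition once per team instead of filtering on it
#(has_10 is an illustrative column name)
group_check <- df %>%
  group_by(team) %>%
  summarise(has_10 = any(points == 10))

#has_10 is TRUE for teams A and B and FALSE for team C
group_check
```

Since the condition returns one value per group, filter() keeps or drops every row of a team together.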
Note that this is just one example of a filter that we could apply.
For instance, we could instead keep only the teams where at least one value in the points column is greater than 13:
library(dplyr)
#group by team and filter out teams where no points value is greater than 13
df %>%
  group_by(team) %>%
  filter(any(points > 13))
# A tibble: 3 x 2
# Groups:   team [1]
  team  points
1 A         10
2 A         15
3 A          8
Notice that only the rows where the team is equal to “A” are kept since this is the only team with at least one points value greater than 13.
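The "# Groups: team" line in the output indicates that the filtered result is still grouped by team. If later steps should treat the data as a single table, one option is to add ungroup() at the end of the pipeline. The following is a minimal sketch of that idea; the name result is chosen only for this illustration.

```r
library(dplyr)

#recreate the data frame from the example above
df <- data.frame(team=c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'),
                 points=c(10, 15, 8, 4, 10, 10, 12, 12, 7))

#filter by group, then drop the grouping from the result
result <- df %>%
  group_by(team) %>%
  filter(any(points > 13)) %>%
  ungroup()

#check whether the result is still grouped (FALSE after ungroup())
is_grouped_df(result)
```

Without ungroup(), later calls such as summarise() or mutate() would continue to operate within each team, which may or may not be what is wanted.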
Note: The complete documentation for the filter() function is available on the dplyr reference site.
Additional Resources
The following tutorials explain how to perform other common operations in dplyr:
How to Select the First Row by Group Using dplyr
How to Filter by Multiple Conditions Using dplyr
How to Filter Rows that Contain a Certain String Using dplyr