I suspect some of my data is bad, and I want to remove every fruit that has any price below 0. How can I do that?
fruit year price
apple 2021 2
apple 2020 -9
apple 2019 3
banana 2021 9
banana 2020 7
banana 2019 5
orange 2021 7
orange 2020 2
orange 2019 -3
Desired result:
fruit year price
banana 2021 9
banana 2020 7
banana 2019 5
>Solution :
There are several possible solutions; here are three:
base R
dat[!dat$fruit %in% unique(dat[dat$price < 0, "fruit"]),]
dplyr
With all:
library(dplyr)
dat %>%
group_by(fruit) %>%
filter(all(price >= 0))  # keep only groups with no negative price
Or, with any:
dat %>%
group_by(fruit) %>%
filter(!any(price < 0))
output
# A tibble: 3 x 3
# Groups: fruit [1]
fruit year price
<chr> <int> <int>
1 banana 2021 9
2 banana 2020 7
3 banana 2019 5
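To try the snippets above, here is a minimal sketch that rebuilds the data from the question and applies one more base R variant using `ave()`. The name `dat` matches the answer; the integer column types and the helper name `has_negative` are assumptions made for illustration.

```r
# Rebuild the example data from the question (integer columns are assumed)
dat <- data.frame(
  fruit = rep(c("apple", "banana", "orange"), each = 3),
  year  = rep(c(2021L, 2020L, 2019L), times = 3),
  price = c(2L, -9L, 3L, 9L, 7L, 5L, 7L, 2L, -3L),
  stringsAsFactors = FALSE
)

# For each row, flag whether its fruit group contains any negative price
has_negative <- ave(dat$price < 0, dat$fruit, FUN = any)

# Keep only the rows whose group has no negative price
result <- dat[!has_negative, ]
print(result)
```

Because the flag is computed per group and then recycled to every row of that group, a single bad price drops the whole fruit, which is the behavior the question asks for.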