(dataframe) Remove all of a group's rows (e.g. abcdefg) if some of its data has a problem (e.g. cdf)

If I suspect that some of the data is bad and want to remove every fruit that has any price below 0, how can I do that?

fruit year price
apple  2021    2
apple  2020   -9
apple  2019    3
banana 2021    9
banana 2020    7
banana 2019    5
orange 2021    7
orange 2020    2
orange 2019   -3

->

fruit year price
banana 2021    9
banana 2020    7
banana 2019    5

Solution:

There are several possible solutions; here are three:

base R

# drop every row whose fruit has at least one negative price
dat[!dat$fruit %in% dat$fruit[dat$price < 0], ]
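
The question does not show how `dat` was created, so here is a minimal sketch that assumes it is a plain data.frame with integer `year` and `price` columns. It splits the base R approach into two steps; `bad_fruit` and `clean` are illustrative names, not part of the original answer.

```r
# Assumed reconstruction of the data shown in the question
dat <- data.frame(
  fruit = rep(c("apple", "banana", "orange"), each = 3),
  year  = rep(c(2021L, 2020L, 2019L), times = 3),
  price = c(2L, -9L, 3L, 9L, 7L, 5L, 7L, 2L, -3L)
)

# Step 1: collect the fruits that have at least one negative price
bad_fruit <- unique(dat$fruit[dat$price < 0])

# Step 2: keep only the rows whose fruit is not in that set
clean <- dat[!dat$fruit %in% bad_fruit, ]
clean
```

Naming the intermediate set makes it easy to inspect which groups are being discarded before the rows are actually dropped.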

dplyr

With all:

library(dplyr)
dat %>% 
  group_by(fruit) %>% 
  # keep a fruit only if none of its prices is below 0
  filter(all(price >= 0))

Or, with any:

dat %>% 
  group_by(fruit) %>% 
  filter(!any(price < 0))
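
One edge case worth checking is zero or missing prices. The sketch below uses plain vectors with made-up values (not from the question) to show how group-level conditions of this kind evaluate. A strict `> 0` test rejects a zero while a `< 0` test does not, and an `NA` can turn the whole condition into `NA`, which `filter()` treats like `FALSE` and so drops the group unless `na.rm = TRUE` is passed.

```r
# Hypothetical price vectors, standing in for one group each
with_zero <- c(5, 0, 3)
with_na   <- c(5, NA, 3)

# Zero: a strict comparison rejects the group, a "no negatives" test keeps it
zero_strict   <- all(with_zero > 0)     # FALSE
zero_no_neg   <- !any(with_zero < 0)    # TRUE

# NA: the condition itself becomes NA
na_default    <- !any(with_na < 0)                 # NA
na_removed    <- !any(with_na < 0, na.rm = TRUE)   # TRUE
```

Whether a group with missing prices should be kept is a judgment call about the data; `na.rm = TRUE` is only one possible choice.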

output

# A tibble: 3 x 3
# Groups:   fruit [1]
  fruit   year price
  <chr>  <int> <int>
1 banana  2021     9
2 banana  2020     7
3 banana  2019     5