Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Compare two observations and keep the higher Vale

I have a DF that looks like this;

ID      Station   Date       Value
111-001 Place1   2022-05-01  39.1
111-002 Place2   2022-05-01  0.01
222-001 Place1   2022-05-02  8.5
222-002 Place2   2022-05-02  40.3

I’d like to compare the similar IDs and keep the higher of the two. How is the best method to achieve this? I prefer using packages to base and any help is appreciated.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Use susbtr to group_by the 3 first numbers of ID, then use max:

library(dplyr)
dat %>% 
  group_by(gp = substr(ID, 1, 3)) %>% 
  mutate(max = max(Value))

# A tibble: 4 × 6
# Groups:   gp [2]
  ID      Station Date       Value gp      max
  <chr>   <chr>   <chr>      <dbl> <chr> <dbl>
1 111-001 Place1  2022-05-01 39.1  111    39.1
2 111-002 Place2  2022-05-01  0.01 111    39.1
3 222-001 Place1  2022-05-02  8.5  222    40.3
4 222-002 Place2  2022-05-02 40.3  222    40.3

In base R, use ave:

with(dat, ave(Value, substr(ID, 1, 3), FUN = max))
#[1] 39.1 39.1 40.3 40.3
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading