Filtering to keep max but to ignore ties in R

I am working in R.

I have some colleges which operate from multiple sites. We know how many students there are at each site.

data <- data.frame(provider_id_num = c("1", "1", "2", "3", "3"), 
                   postcode = c("S2 3EH", "S2 3ET",  "S2 34h", "S2 rty", "B1 2eh"),
                   number_students = c(1, 3, 5, 2, 2))
provider_id_num postcode number_students
1 S2 3EH 1
1 S2 3ET 3
2 S2 34h 5
3 S2 rty 2
3 B1 2eh 2

For each provider, I want to keep the row with the largest number of students.

However, if there is a tie, I don’t mind which row is kept, as long as exactly one row is kept per provider.

Desired outcome:

provider_id_num postcode number_students
1 S2 3ET 3
2 S2 34h 5
3 S2 rty 2

OR:

provider_id_num postcode number_students
1 S2 3ET 3
2 S2 34h 5
3 B1 2eh 2

Does anyone have any thoughts?

Solution:

dplyr's slice_max sounds like it could be useful here. The example below uses slice_min, but the with_ties argument is also available for slice_max.

library(dplyr)

# Use with_ties = FALSE to return exactly n matches
mtcars %>% slice_min(cyl, n = 1, with_ties = FALSE)
#>             mpg cyl disp hp drat   wt  qsec vs am gear carb
#> Datsun 710 22.8   4  108 93 3.85 2.32 18.61  1  1    4    1
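
Applied to the data from the question, the same idea might look like the sketch below. It assumes dplyr is installed and that grouping by provider_id_num before slicing is what is wanted; the name result is only illustrative. With with_ties = FALSE, exactly one row is kept per provider even when two sites have the same number of students.

```r
library(dplyr)

# Data frame from the question
data <- data.frame(
  provider_id_num = c("1", "1", "2", "3", "3"),
  postcode = c("S2 3EH", "S2 3ET", "S2 34h", "S2 rty", "B1 2eh"),
  number_students = c(1, 3, 5, 2, 2)
)

# Within each provider, keep the single row with the most students.
# with_ties = FALSE guarantees one row per group even when counts are tied.
result <- data %>%
  group_by(provider_id_num) %>%
  slice_max(number_students, n = 1, with_ties = FALSE) %>%
  ungroup()

print(result)
```

Dropping with_ties = FALSE (the default is TRUE) would return both rows for provider 3, which is the behaviour the question wants to avoid.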