Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Keep most frequent value per group

I need to summarise some data.frame containing both numeric and character values. I need to sum the numerical features and to keep the most frequent character value per group at the same time. Ideally, it should be with dplyr but that’s not mandatory.

Here is a reproducible example :

have <- data.frame("id" = c(1,1,1,2,2,2),"a" = c("dog","dog","cat","duck","duck","horse"),"value" = c(1,2,3,4,5,6))
need <- data.frame("id" = c(1,2), "a" = c("dog","duck"), "value" = c(6,15))

Any idea ?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

You could group_by and sum for total value and use table with which.max and names to select the character values that frequent the most like this:

library(dplyr)
have %>%
  group_by(id) %>%
  summarise(a = names(which.max(table(a))),
            value = sum(value))
#> # A tibble: 2 × 3
#>      id a     value
#>   <dbl> <chr> <dbl>
#> 1     1 dog       6
#> 2     2 duck     15

Created on 2023-03-10 with reprex v2.0.2

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading