Keep most frequent value per group

I need to summarise some data.frame containing both numeric and character values. I need to sum the numerical features and to keep the most frequent character value per group at the same time. Ideally, it should be with dplyr but that’s not mandatory.

Here is a reproducible example :

have <- data.frame("id" = c(1,1,1,2,2,2),"a" = c("dog","dog","cat","duck","duck","horse"),"value" = c(1,2,3,4,5,6))
need <- data.frame("id" = c(1,2), "a" = c("dog","duck"), "value" = c(6,15))

Any idea ?

>Solution :

You could group_by and sum for total value and use table with which.max and names to select the character values that frequent the most like this:

library(dplyr)
have %>%
  group_by(id) %>%
  summarise(a = names(which.max(table(a))),
            value = sum(value))
#> # A tibble: 2 × 3
#>      id a     value
#>   <dbl> <chr> <dbl>
#> 1     1 dog       6
#> 2     2 duck     15

Created on 2023-03-10 with reprex v2.0.2

Leave a Reply