Calculating multiple columns from one column with summarise

Here is an example of what I’m trying to achieve:

library(dplyr)

df <- data.frame(label = c(rep("ABC", 5), rep("CDE", 5), rep("FGH", 5)), x = runif(15, 0, 100))

df %>% group_by(label) %>%
  summarise(across(everything(), list(lessthan_10 = ~sum(. < 10), lessthan_20 = ~sum(. < 20), lessthan_30 = ~sum(. < 30), lessthan_40 = ~sum(. < 40))))

In this case, I’m calculating 4 different columns in the summary (counting the entries less than 10, less than 20, less than 30, and less than 40). In reality, I would like to calculate 100 different columns using a custom function that takes in x and 100 different parameters. Is there a way to do this using a loop or a list without writing out every single column I want to calculate?

Solution:

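You can use purrr::map_dfc to build one single-column tibble per threshold and bind them column-wise. summarise() unpacks the resulting data frame into separate columns: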

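library(tidyverse)

# For each threshold, create a one-column tibble whose name is built with glue syntax
df %>%
  group_by(label) %>%
  summarise(map_dfc(seq(10, 40, 10), ~ tibble("x_lessthan_{.x}" := sum(x < .x))))

Output (the counts depend on the random draw from runif):

  label x_lessthan_10 x_lessthan_20 x_lessthan_30 x_lessthan_40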
  <chr>         <int>         <int>         <int>         <int>
1 ABC               0             3             3             3
2 CDE               0             2             3             4
3 FGH               1             2             3             3
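
An alternative that stays closer to the across() call in the question is to generate the named list of functions programmatically and pass it to across(). The following is a minimal sketch, not part of the original answer: the helper name count_below and the thresholds vector are hypothetical, and set.seed() is added only to make the example reproducible.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(label = rep(c("ABC", "CDE", "FGH"), each = 5),
                 x = runif(15, 0, 100))

# Parameters to loop over; for 100 columns this could be, e.g., 1:100
thresholds <- seq(10, 40, 10)

# Hypothetical helper: returns a function that counts values below `cutoff`
count_below <- function(cutoff) {
  force(cutoff)  # evaluate the argument now so each closure keeps its own cutoff
  function(v) sum(v < cutoff)
}

# Named list of functions, one per threshold
fns <- lapply(thresholds, count_below)
names(fns) <- paste0("lessthan_", thresholds)

# across() names the outputs "{column}_{function name}", e.g. x_lessthan_10
result <- df %>%
  group_by(label) %>%
  summarise(across(x, fns))

result
```

Replacing the body of the inner function with a call to your own custom function should generalize this to any per-parameter summary, as long as it returns a single value per group.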