Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Group rows based one column and sum up the rest of the columns

I try to group the rows of a data.frame by the ID column and sum up all the numerical columns and drop the string column. For that, I created a small data.frame df1:

   ID string 2018 2019 2020 2021
1: a1     x2    3    3    0    4
2: a2     g3    5    5    4    0
3: a2     n2   11    6   13    3
4: a1     m3    3   21    9    8
5: a2     2w    9    1   16    5
6: a1    ps2   22    4    7    4
7: a1    kg2    6    0    9    6

and I try to get the sum of the years like in df2:

   ID 2018 2019 2020 2021
1: a1   34   28   25   22
2: a2   25   12   33    8

I was trying it with the group_by and summarize function of dplyr, but I wasn’t able to get what I want.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

library(dplyr)
df1 %>%
  group_by(ID) %>%
  summarize(df$2018 = sum(2018))

Thanks a lot

>Solution :

You could use

library(data.table)

df1[, lapply(.SD[, `2018`:`2021`], sum), by = ID]

This returns

   ID 2018 2019 2020 2021
1: a1   34   28   25   22
2: a2   25   12   33    8
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading