Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to sum rows based on group and sub-group using dplyr in R?

My dataframe has 2 categorical variables, and one is of a lower hierarchy than the other. I want to sum the numerical value of all rows within the sub-category using dplyr.

Thank you in advance for all those who can help me with!

This is the dataframe I start with:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

transportation <- data.frame(
  Country = c("A", "A", "A", "B", "B", "B"),
  Mode = c("Car", "Train", "Plane", "Car", "Train", "Plane"),
  Energy = c(10000, 9000, 20000, 200000, 160000, 450000)
)

Starting Dataframe

And this is the dataframe I want to end up with:

country_sum <- data.frame(
  Country = c("A", "A", "A", "B", "B", "B"),
  Mode = c("Car", "Train", "Plane", "Car", "Train", "Plane"),
  Energy = c(10000, 9000, 20000, 200000, 160000, 450000),
  country_sum = c(39000, 39000, 39000, 810000, 810000, 810000)
)

Final Dataframe

>Solution :

First group by Country and then mutate with sum:

library(dplyr)

transportation %>% 
  group_by(Country) %>% 
  mutate(country_sum = sum(Energy))

 Country Mode  Energy country_sum
  <chr>   <chr>  <dbl>       <dbl>
1 A       Car    10000       39000
2 A       Train   9000       39000
3 A       Plane  20000       39000
4 B       Car   200000      810000
5 B       Train 160000      810000
6 B       Plane 450000      810000
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading