How to ungroup and mutate results back onto the original dataset after summing values in R

I want to combine two compound commands from package "dplyr" for simplicity.

This is a hypothetical dataset:

V5 V15 sum length density
upstream g1 1234 17645 0.1
upstream g2 3456 17645 0.3
downstream g1 2345 17645 0.2
downstream g2 1456 17645 0.1

I first get the total length of each region:

df %>% dplyr::group_by(V5) %>% 
  dplyr::summarize(sum(sum)) %>% 
  ungroup()

Then I manually add the totals to a new column and compute the percentage:

df <- df %>% mutate(region = case_when(
    str_detect(V5, "upstream") ~ "4690",
    str_detect(V5, "downstream") ~ "3801"
))

df$Gsize <- (as.numeric(df$region)/14675549)*100

The function ungroup() doesn't do what I expected: I want the summed value to be added to every row of its group. How can I combine the first and second steps so that each region's size is calculated automatically and added to a new column, from which I can then compute the percentage? Doing this manually is tedious for many regions and many tables.

Expected result:

V5 V15 sum length density region
upstream g1 1234 17645 0.1 4690
upstream g2 3456 17645 0.3 4690
downstream g1 2345 17645 0.2 3801
downstream g2 1456 17645 0.1 3801

>Solution :

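summarize() collapses each group to a single row, so ungroup() cannot bring the original rows back. After computing the totals, join them with the original dataset on V5. Then you can proceed with your percentage calculation.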

library(dplyr)

df %>%
  group_by(V5) %>%
  summarize(total = sum(sum)) %>%
  left_join(df, by = "V5")
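
As an alternative to the join, a grouped mutate() keeps every original row and repeats the group total on each of them. The sketch below is a minimal example under a few assumptions: the data frame is rebuilt by hand from the table in the question, the new column is named region to match the expected result, and genome_size is a hypothetical variable name holding the denominator 14675549 used in the question.

```r
library(dplyr)

# Toy data rebuilt from the table in the question
df <- data.frame(
  V5      = c("upstream", "upstream", "downstream", "downstream"),
  V15     = c("g1", "g2", "g1", "g2"),
  sum     = c(1234, 3456, 2345, 1456),
  length  = 17645,
  density = c(0.1, 0.3, 0.2, 0.1)
)

# Hypothetical name; the value is the denominator from the question
genome_size <- 14675549

result <- df %>%
  group_by(V5) %>%
  mutate(region = sum(sum)) %>%              # group total repeated on every row
  ungroup() %>%
  mutate(Gsize = region / genome_size * 100) # percentage, as in the question

print(result)
```

Because mutate() does not collapse rows, ungroup() here only drops the grouping, and the region and Gsize columns end up as the last columns, in the same row order as the input. With dplyr 1.1.0 or later, mutate(region = sum(sum), .by = V5) should give the same totals without a separate group_by()/ungroup() pair.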
