Home Conditional sum with mutate (dplyr)

Questions

Conditional sum with mutate (dplyr)

July 28, 2022

I want to achieve something similar to the excel sumif. It is quite easy if the condition is equal to something, but not for greater or smaller because then we can’t use the group_by() approach. For different reasons I can’t use arrange() and then using cumsum(). The data in the example is ordered, but it is not always ordered.

I am giving an small example, in reality I have more conditions:

data <- data.frame(a = c(2022, 2021, 2020), b = c(10, 20, 10))

get_sum  <- function(x, year){
  out <- filter(x, b <= year) %>%
    summarise(cum = sum(b)) %>% 
    pull()
  return(out)
}


data %>% 
  mutate(cum = get_sum(x = ., year = b))

Steps: for the first row mutate, it filters the whole data.frame by values equal or less than 2022, then summarize it by making the sum of the filtered data.frame column b: 10 + 20 + 10.
for the second row do the same and now filter the value of b equal or smaller than 2021 being the mutate output: 20 + 10, and so on.

The output should look something like this:

     a  b cum
1 2022 10  40
2 2021 20  30
3 2020 10  10

I would like to know why doesn’t work my code.

>Solution :

We could use map from purrr if we want to stay within tidyverse. We need map to input individual values of a into the function. In your small example, i.e.

library(dplyr)
library(purrr)

data |>
  mutate(cum = map(a, ~ sum(b[a <= .])))

Update. To make your own function work you could apply a similar logic using map, using cur_data() and fix a small typo within get_sum (a <= year rather than b <= year):

get_sum <- function(df, year) {

  df |>
    filter(a <= year) |>
    summarise(cum = sum(b)) |>
    pull()

}

data |>
  mutate(cum = map(a, ~ get_sum(df = cur_data(), year = .)))

Output:

     a  b cum
1 2022 10  40
2 2021 20  30
3 2020 10  10

dplyr

byMR

Published July 28, 2022

Add a comment

.join() gives back list as a result in python

byMR

July 28, 2022

Questions

Conditional forward fill values in Spark dataframe

byMR

July 28, 2022

Questions

Using an OR statement inside an if function

byMR

July 28, 2022

Questions

Adding an artificial primary key versus using a unique field

byMR

July 28, 2022

Questions

onClick(), I want to add table row

byMR

July 28, 2022

Questions

Could not find or load main class Sample.java

byMR

July 28, 2022

Conditional sum with mutate (dplyr)

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

.join() gives back list as a result in python

Conditional forward fill values in Spark dataframe

Using an OR statement inside an if function

Adding an artificial primary key versus using a unique field

onClick(), I want to add table row

Could not find or load main class Sample.java

Keep Up to Date with the Most Important News

Conditional sum with mutate (dplyr)

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

.join() gives back list as a result in python

Conditional forward fill values in Spark dataframe

Using an OR statement inside an if function

Adding an artificial primary key versus using a unique field

onClick(), I want to add table row

Could not find or load main class Sample.java

Discover more from Dev solutions