Home iteratively calculating values in a variable taking into account previous calculated value

Questions

iteratively calculating values in a variable taking into account previous calculated value

December 11, 2023

The question arised while playing around with coding options for Advent of Code 2023 day 9. Let’s say we have a list col object with tibbles in one column. Each of these tibbles has a variable aa. The aim is a new variable bb which is the value of aa when all aa values below are zeroes. From then on, bb is to be calculated as: aa + lead(bb); i.e. summing up aa from the current row with bb one row below.

Here is a reproducible example:

dat <- tibble(
  V1 = sample(LETTERS, 3),
  V2 = rnorm(3, 100, 50),
  V3 = list(
    tibble(var_not_used = sample(LETTERS, 5), aa = c(15, 3, 0, 0 ,0)),
    tibble(var_not_used = sample(LETTERS, 5), aa = c(21, 6, 1, 0, 0)),
    tibble(var_not_used = sample(LETTERS, 5), aa = c(45, 15, 6, 2, 0))
  )
  )

bb in tibble 1 should be 18, 3, 0, 0, 0; bb in tibble 2 should be 28, 7, 1, 0, 0; bb in tibble 3 should be 68, 23, 8, 2, 0. Below is an attempt/illustration in which direction it should go.

How could we code elegant to get only one column bb? I am especially interested in tidyverse solutions; unnesting is an option, but would be nice, if the solution works directly on the nested object.

dat <- tibble(
  V1 = sample(LETTERS, 3),
  V2 = rnorm(3, 100, 50),
  V3 = list(
    tibble(var_not_used = sample(LETTERS, 5), aa = c(15, 3, 0, 0 ,0)),
    tibble(var_not_used = sample(LETTERS, 5), aa = c(21, 6, 1, 0, 0)),
    tibble(var_not_used = sample(LETTERS, 5), aa = c(45, 15, 6, 2, 0))
  )
  ) %>% 
  ## from here on: this returns the data that we are interested in; however,
  ## its a lot of code repetition and many new columns. Moreover, in reality
  ## we might face situations in which we had to do way more than 4 of these
  ## mutate's
  mutate(
    V3 = map(V3,  ~ .x %>% 
               mutate(bb1 = case_when(
                 lead(aa) == 0 ~ aa + lead(aa, default = 0)
                 ))
             )
    ) %>% 
  mutate(
    V3 = map(V3,  ~ .x %>% 
               mutate(bb2 = case_when(
                 is.na(bb1) ~ aa + lead(bb1, default = 0)
                 ))
             )
    )  %>% 
  mutate(
    V3 = map(V3,  ~ .x %>% 
               mutate(bb3 = case_when(
                 is.na(bb2) ~ aa + lead(bb2, default = 0)
                 ))
             )
    ) %>% 
  mutate(
    V3 = map(V3,  ~ .x %>% 
               mutate(bb4 = case_when(
                 is.na(bb3) ~ aa + lead(bb3, default = 0)
                 ))
             )
    )

>Solution :

It looks like bb is just the reverse cumulative sum of aa, so we can do:

dat %>%
  mutate(V3 = map(V3, ~ .x %>% mutate(bb = rev(cumsum(rev(aa))))))

We can see what this looks like by plucking out the list column:

dat %>%
  mutate(V3 = map(V3, ~ .x %>% mutate(bb = rev(cumsum(rev(aa)))))) %>%
  pluck('V3')
#> [[1]]
#> # A tibble: 5 x 3
#>   var_not_used    aa    bb
#>   <chr>        <dbl> <dbl>
#> 1 W               15    18
#> 2 K                3     3
#> 3 G                0     0
#> 4 Q                0     0
#> 5 F                0     0
#> 
#> [[2]]
#> # A tibble: 5 x 3
#>   var_not_used    aa    bb
#>   <chr>        <dbl> <dbl>
#> 1 P               21    28
#> 2 O                6     7
#> 3 A                1     1
#> 4 W                0     0
#> 5 M                0     0
#> 
#> [[3]]
#> # A tibble: 5 x 3
#>   var_not_used    aa    bb
#>   <chr>        <dbl> <dbl>
#> 1 M               45    68
#> 2 B               15    23
#> 3 Y                6     8
#> 4 A                2     2
#> 5 T                0     0