I have a matrix
x <- cbind(label = c(1,1,1,-1,-1,0,0,-1,1,1),
val = c(1,3,7,5,4,2,6,8,5,1))
x
label val
[1,] 1 1
[2,] 1 3
[3,] 1 7
[4,] -1 5
[5,] -1 4
[6,] 0 2
[7,] 0 6
[8,] -1 8
[9,] 1 5
[10,] 1 1
I want to calculate the cumulative sum of sequences of values in the val, but the sequence is determined by the label column.
That is:
label1,1,1 cumsumval1, 1+3, 1+4+7label-1,-1 cumsumval-5, -5 + -4- if the
labelis zero then we ignore it
and so on..
label can only have three values -1, 0, 1
Also I want to have a column with the last value from each cumulative sum
In the final result I would like to get a matrix like this
res_x <- cbind(x,
cum_val = c(1,4,11,-5,-9,0,0,-8,5,6),
last_cum_val = c(NA,NA,11,NA,-9,NA,NA,8,NA,6))
res_x
label val cum_val last_cum_val
[1,] 1 1 1 NA
[2,] 1 3 4 NA
[3,] 1 7 11 11
[4,] -1 5 -5 NA
[5,] -1 4 -9 -9
[6,] 0 2 0 NA
[7,] 0 6 0 NA
[8,] -1 8 -8 -8
[9,] 1 5 5 NA
[10,] 1 1 6 6
>Solution :
x |>
mutate(group = c(0, cumsum(diff(label) != 0))) |>
mutate(cum_val = sign(label) * cumsum(val), .by = group) |>
mutate(last_cum_val = ifelse(row_number() == n(), last(cum_val), NA), .by = group)
label val group cum_val last_cum_val
[1,] 1 1 0 1 NA
[2,] 1 3 0 4 NA
[3,] 1 7 0 11 11
[4,] -1 5 1 -5 NA
[5,] -1 4 1 -9 -9
[6,] 0 2 2 0 NA
[7,] 0 6 2 0 0
[8,] -1 8 3 -8 -8
[9,] 1 5 4 5 NA
[10,] 1 1 4 6 6