Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Using cumsum for panel data in r

I have a unbalanced panel data from a dataframe of transactions. I want to create a variable balance that sums the previously made transactions and thus reflects the balance of the wallet in a given period.

Head 40 looks like this:

period uniqueid            ethamount       date           dollvalue ispurchase isunrealgain Freq day holdingtime
1       1        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
2       2        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
3       3        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
4       4        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
5       5        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
6       6        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
7       7        1     0.29486577590755 2017-12-10 -126.56818565055676          1            0    3 641         134
8       8        1  0.12565359813650237 2018-03-29 -48.266060116193295          1            0    3 750         134
9       8        1 -0.13903051539510708 2018-04-23   89.55372588145032          0            0    3 775         134
10      9        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
11     10        1                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
12      1        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
13      2        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
14      3        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
15      4        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
16      5        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
17      6        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
18      7        2                0.151 2018-01-24           -160.5432          1            0    4 686         196
19      7        2                0.093 2018-02-07           -70.04574          1            0    4 700         196
20      8        2                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
21      9        2          -0.22047692 2018-08-08   78.36631644479999          0            0    4 882         196
22     10        2                0.087 2018-09-25 -19.056479999999997          1            0    4 930         196
23      1        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
24      2        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
25      3        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
26      4        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
27      5        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
28      6        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
29      7        3                  2.4 2017-11-30 -1037.3039999999999          1            0    3 631          35
30      7        3                   -1 2018-01-04                 940          0            0    3 666          35
31      8        3                 0.92 2018-03-04           -796.2876          1            0    3 725          35
32      9        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
33     10        3                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
34      1        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
35      2        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
36      3        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
37      4        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
38      5        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
39      6        4                    0       <NA>                   0       <NA>         <NA> <NA>  NA        <NA>
40      7        4                 0.43 2017-11-05 -127.12089999999999          1            1    4 606         210

Simply using group_by and summarize does not work because it only outputs one value, I need however to create a variable balance that adds all the ethamount values from previous rows period by uniqueid in order to have the wallet balance for the respective period

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

What I am trying to do would look somethig like this, however that code does not work

paneldata2$balance <- paneldata2 %>% 
  group_by(uniqueid) %>%
  mutate(cumsum(paneldata2$ethamount)) %>%
  ungroup()

Error in `mutate_cols()`:
! Problem with `mutate()` input `..1`.
i `..1 = cumsum(paneldata2$ethamount)`.
i `..1` must be size 11 or 1, not 529476.
i The error occurred in group 1: uniqueid = 1.

Is there an alternative way of doing what I want or can this code easily be fixed?

Thanks in advance

>Solution :

You should write some functions differently, maybe you want this:

library(dplyr)
paneldata2 %>% 
  group_by(uniqueid) %>%
  mutate(balance = cumsum(ethamount)) %>%
  ungroup()

Output:

# A tibble: 40 × 11
   period uniqueid ethamount date       dollvalue ispurchase isunrealgain Freq    day holdingtime balance
    <int>    <int>     <dbl> <chr>          <dbl> <chr>      <chr>        <chr> <int> <chr>         <dbl>
 1      1        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 2      2        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 3      3        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 4      4        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 5      5        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 6      6        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0    
 7      7        1     0.295 2017-12-10    -127.  1          0            3       641 134           0.295
 8      8        1     0.126 2018-03-29     -48.3 1          0            3       750 134           0.421
 9      8        1    -0.139 2018-04-23      89.6 0          0            3       775 134           0.281
10      9        1     0     <NA>             0   <NA>       <NA>         <NA>     NA <NA>          0.281
# … with 30 more rows
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading