I have a python issue which I can’t seem to figure out. The following dataframe illustrates persons who are spending money in given years and months:
| ID | Year | Month | Amount |
|---|---|---|---|
| 134 | 2020 | 11 | -199 |
| 134 | 2020 | 12 | -50 |
| 134 | 2021 | 1 | 40 |
| 135 | 2020 | 11 | -365 |
| 135 | 2020 | 12 | -23 |
| 135 | 2021 | 1 | 400 |
However, I want to add a new column that sums these expenses each month.
Example: A person spends 50 euro’s in the 1st month and 100 in the 2nd. This will amount to 150 euro’s spent in total in month 2. I want to illustrate this spending behaviour in the new column with the assumption that everyone has 100 euro’s as starting capital. (This is why In the example below, a person spends 200 but the new column shows -100.
Here is the column that I would like:
| ID | Year | Month | Amount | NewColumn |
|---|---|---|---|---|
| 134 | 2020 | 11 | -200 | -100 |
| 134 | 2020 | 12 | -50 | -150 |
| 134 | 2021 | 1 | 40 | -110 |
| 135 | 2020 | 11 | -365 | -265 |
| 135 | 2020 | 12 | -23 | -288 |
| 135 | 2021 | 1 | 400 | 112 |
Here is the code to recreate the first dataframe:
data = {'ID': ['134', '134','134','135','135','135',], 'Year': [2020, 2020, 2021, 2020, 2020, 2021,], 'Month': [11, 12, 1, 11, 12, 1], 'Amount': [-199, -50, 40, -365, -23, 400]}
df = pd.DataFrame(data)
I’ve tried cumsum() and .rolling but can’t figure it out.
Hopefully I’ve given enough info and otherwise I’ll be happy to supply some more
Thanks in advance
>Solution :
You can use cumsum() + 100:
df["NewColumn"] = df.groupby(["ID"])["Amount"].cumsum()+100
Output:
ID Year Month Amount NewColumn
0 134 2020 11 -199 -99
1 134 2020 12 -50 -149
2 134 2021 1 40 -109
3 135 2020 11 -365 -265
4 135 2020 12 -23 -288
5 135 2021 1 400 112
PS: there is a small difference to your expected output, as you started with -200, but in the data it was -199 🙂