Lets say I have a dataframe where I have banking information. I have present value, a list of transactions, and I want to work backwards to calculate the balance over time.
Here is the dataframe:
value CN_running_balance
2020-08-07 -50.82 843.70
2020-08-06 893.77 NaN
2020-08-05 0.00 NaN
2020-08-04 -9.56 NaN
2020-08-03 -12.21 NaN
... ... ...
2020-05-14 1224.78 NaN
2020-05-13 0.00 NaN
2020-05-12 0.00 NaN
2020-05-11 -25.00 NaN
2020-05-10 -0.00 NaN
And I want to transform the running balance to use that rows value column for next row, by subtracting the value from the balance.
value CN_running_balance
2020-08-07 -50.82 843.70
2020-08-06 893.77 894.52
2020-08-05 0.00 0.75
2020-08-04 -9.56 etc
2020-08-03 -12.21 etc
... ... ...
2020-05-14 1224.78 etc
2020-05-13 0.00 etc
2020-05-12 0.00 etc
2020-05-11 -25.00 etc
2020-05-10 -0.00 etc
This has been pretty tricky for me so I would appreciate any suggestions on how to solve the problem!
>Solution :
I would suggest simply iterating over your dataframe as you have to forward fill the value. Here is one approach:
from io import StringIO
import pandas as pd
# Create your data frame
input_string = """value CN_running_balance
2020-08-07 -50.82 843.70
2020-08-06 893.77 NaN
2020-08-05 0.00 NaN
2020-08-04 -9.56 NaN
2020-08-03 -12.21 NaN
2020-05-14 1224.78 NaN
2020-05-13 0.00 NaN
2020-05-12 0.00 NaN
2020-05-11 -25.00 NaN
2020-05-10 -0.00 NaN"""
data = StringIO(input_string)
df = pd.read_csv(data, sep=' ', engine='python')
# Create iterable index
df = df.reset_index()
# Forward fill running balance
for i in range(1, len(df)):
df.loc[i, 'CN_running_balance'] = df.loc[i-1, 'CN_running_balance'] - df.loc[i-1, 'value']
# Reset original index
df = df.set_index('index')
print(df)
Output:
value CN_running_balance
index
2020-08-07 -50.82 843.70
2020-08-06 893.77 894.52
2020-08-05 0.00 0.75
2020-08-04 -9.56 0.75
2020-08-03 -12.21 10.31
2020-05-14 1224.78 22.52
2020-05-13 0.00 -1202.26
2020-05-12 0.00 -1202.26
2020-05-11 -25.00 -1202.26
2020-05-10 -0.00 -1177.26