Here is my dataframe view:
index date B N S y_B y_N y_S f_price y_price p_change price_day_t-2
1 2021-01-12 1 29 0 2.0 57.0 0.0 10250.0 10760.0 -1.0 11060.0
2 2021-01-13 0 67 0 1.0 29.0 0.0 9810.0 10250.0 -1.0 10760.0
3 2021-01-14 2 19 0 0.0 67.0 0.0 NaN NaN NaN NaN
4 2021-01-15 1 6 0 2.0 19.0 0.0 NaN NaN NaN NaN
5 2021-01-16 2 46 0 1.0 6.0 0.0 9340.0 9810.0 -1.0 10250.0
6 2021-01-17 3 22 0 2.0 46.0 0.0 NaN NaN NaN NaN
7 2021-01-18 1 34 0 3.0 22.0 0.0 8890.0 9340.0 -1.0 9810.0
I want to delete rows with null values but at the same time, I want to add the other columns’ values to the very next row without null values. which the result would be like this:
index date B N S y_B y_N y_S f_price y_price p_change price_day_t-2
1 2021-01-12 1 29 0 2.0 57.0 0.0 10250.0 10760.0 -1.0 11060.0
2 2021-01-13 0 67 0 1.0 29.0 0.0 9810.0 10250.0 -1.0 10760.0
5 2021-01-16 5 71 0 3.0 92.0 0.0 9340.0 9810.0 -1.0 10250.0
7 2021-01-18 4 56 0 5.0 68.0 0.0 8890.0 9340.0 -1.0 9810.0
for example in row n.5 column "B", the result is 5 because of the aggregation 2+1+2. (the two other values [1] and [2] are from deleted rows)
How can I do this?
>Solution :
You can create groups by comapre last column from back for non missing values with cumulative sum, then aggregate sum for all columns without date, it aggregate by last values per groups:
g = df.iloc[::-1, -1].notna().cumsum()
d = dict.fromkeys(df.columns, 'sum')
d['date'] = 'last'
df = df.groupby(g, sort=False).agg(d).reset_index(drop=True)
print (df)
date B N S y_B y_N y_S f_price y_price p_change \
0 2021-01-12 1 29 0 2.0 57.0 0.0 10250.0 10760.0 -1.0
1 2021-01-13 0 67 0 1.0 29.0 0.0 9810.0 10250.0 -1.0
2 2021-01-16 5 71 0 3.0 92.0 0.0 9340.0 9810.0 -1.0
3 2021-01-18 4 56 0 5.0 68.0 0.0 8890.0 9340.0 -1.0
price_day_t-2
0 11060.0
1 10760.0
2 10250.0
3 9810.0