Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

how to delete NaN values in pandas while adding other columns' value to the very next row which has not NaN value?

Here is my dataframe view:

index   date    B   N   S   y_B   y_N   y_S  f_price    y_price  p_change   price_day_t-2
1   2021-01-12  1   29  0   2.0   57.0  0.0  10250.0    10760.0  -1.0        11060.0
2   2021-01-13  0   67  0   1.0   29.0  0.0  9810.0     10250.0  -1.0        10760.0
3   2021-01-14  2   19  0   0.0   67.0  0.0  NaN         NaN     NaN           NaN
4   2021-01-15  1   6   0   2.0   19.0  0.0  NaN         NaN     NaN           NaN
5   2021-01-16  2   46  0   1.0   6.0   0.0  9340.0     9810.0   -1.0        10250.0
6   2021-01-17  3   22  0   2.0   46.0  0.0  NaN         NaN     NaN           NaN
7   2021-01-18  1   34  0   3.0   22.0  0.0  8890.0     9340.0   -1.0        9810.0

I want to delete rows with null values but at the same time, I want to add the other columns’ values to the very next row without null values. which the result would be like this:

index   date    B   N   S   y_B   y_N   y_S  f_price    y_price  p_change   price_day_t-2
1   2021-01-12  1   29  0   2.0   57.0  0.0  10250.0    10760.0  -1.0        11060.0
2   2021-01-13  0   67  0   1.0   29.0  0.0  9810.0     10250.0  -1.0        10760.0
5   2021-01-16  5   71  0   3.0   92.0  0.0  9340.0     9810.0   -1.0        10250.0
7   2021-01-18  4   56  0   5.0   68.0  0.0  8890.0     9340.0   -1.0        9810.0

for example in row n.5 column "B", the result is 5 because of the aggregation 2+1+2. (the two other values [1] and [2] are from deleted rows)

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

How can I do this?

>Solution :

You can create groups by comapre last column from back for non missing values with cumulative sum, then aggregate sum for all columns without date, it aggregate by last values per groups:

g = df.iloc[::-1, -1].notna().cumsum()

d = dict.fromkeys(df.columns, 'sum')
d['date'] = 'last'

df = df.groupby(g, sort=False).agg(d).reset_index(drop=True)
print (df)
         date  B   N  S  y_B   y_N  y_S  f_price  y_price  p_change  \
0  2021-01-12  1  29  0  2.0  57.0  0.0  10250.0  10760.0      -1.0   
1  2021-01-13  0  67  0  1.0  29.0  0.0   9810.0  10250.0      -1.0   
2  2021-01-16  5  71  0  3.0  92.0  0.0   9340.0   9810.0      -1.0   
3  2021-01-18  4  56  0  5.0  68.0  0.0   8890.0   9340.0      -1.0   

   price_day_t-2  
0        11060.0  
1        10760.0  
2        10250.0  
3         9810.0  
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading