Pandas fillna() rows in a specific order

I have a problem with the fillna() method. This is my example df, which represents the quantity of items in a shop. I would like to fill all the NaNs. If a value is NaN, I would like to fill it with the value from the previous day or, if that is also NaN, with the value from the next day (same product, same shop). If all days for a specific product and shop are NaN, I would like to fill them with 0. I am looking for the best pandas way to do this; I had some ideas using loops, but they don’t look very good.

My df:

    day    shop  product  quantity
0     1  shop_A   apples       3.0
1     2  shop_A   apples       NaN
2     3  shop_A   apples       1.0
3     1  shop_A  bananas       NaN
4     2  shop_A  bananas       NaN
5     3  shop_A  bananas       NaN
6     1  shop_B   apples       NaN
7     2  shop_B   apples       NaN
8     3  shop_B   apples       2.0
9     1  shop_B  bananas       NaN
10    2  shop_B  bananas       4.0
11    3  shop_B  bananas       2.0

Expected df:

    day    shop  product  quantity
0     1  shop_A   apples       3.0
1     2  shop_A   apples       3.0
2     3  shop_A   apples       1.0
3     1  shop_A  bananas       0.0
4     2  shop_A  bananas       0.0
5     3  shop_A  bananas       0.0
6     1  shop_B   apples       2.0
7     2  shop_B   apples       2.0
8     3  shop_B   apples       2.0
9     1  shop_B  bananas       4.0
10    2  shop_B  bananas       4.0
11    3  shop_B  bananas       2.0

I also tried fillna(limit=3), but this isn’t what I am looking for.

Solution:

You can sort by day using sort_values, perform a grouped bfill, and then chain fillna(0) so that whatever is still NaN becomes 0:

df['quantity'] = df.sort_values(by='day')\
                 .groupby(['shop','product'])['quantity'].bfill(limit=3).fillna(0)

prints back:

    day    shop  product  quantity
0     1  shop_A   apples       3.0
1     2  shop_A   apples       1.0
2     3  shop_A   apples       1.0
3     1  shop_A  bananas       0.0
4     2  shop_A  bananas       0.0
5     3  shop_A  bananas       0.0
6     1  shop_B   apples       2.0
7     2  shop_B   apples       2.0
8     3  shop_B   apples       2.0
9     1  shop_B  bananas       4.0
10    2  shop_B  bananas       4.0
11    3  shop_B  bananas       2.0

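This fills each NaN with the value from the next day for the same shop and product. Note that row 1 (shop_A, apples, day 2) comes out as 1.0 rather than the 3.0 in the expected df, because bfill on its own never looks at the previous day. You can similarly use ffill (or chain both), or perhaps linear interpolation, and your results will change accordingly. This should nevertheless be enough to get you started.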
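To match the expected df from the question exactly (previous day first, then next day, then 0), one option is to chain ffill and bfill inside each group. The following is a minimal sketch rather than a definitive implementation; it rebuilds the example data by hand, and the helper name fill_quantity is hypothetical, not part of pandas.

```python
import numpy as np
import pandas as pd

# Example data rebuilt from the question.
df = pd.DataFrame({
    "day": [1, 2, 3] * 4,
    "shop": ["shop_A"] * 6 + ["shop_B"] * 6,
    "product": (["apples"] * 3 + ["bananas"] * 3) * 2,
    "quantity": [3, np.nan, 1, np.nan, np.nan, np.nan,
                 np.nan, np.nan, 2, np.nan, 4, 2],
})


def fill_quantity(frame):
    """Hypothetical helper: fill NaN quantities per (shop, product) group.

    Order of preference: previous day (ffill), then next day (bfill),
    then 0 when the whole group is NaN.
    """
    ordered = frame.sort_values("day")
    filled = (
        ordered.groupby(["shop", "product"])["quantity"]
        .transform(lambda s: s.ffill().bfill())
    )
    # Restore the original row order and zero-fill groups that were all NaN.
    return filled.fillna(0).reindex(frame.index)


result = fill_quantity(df)
df["quantity"] = result
print(df)
```

With the example data, the quantity column becomes 3, 3, 1, 0, 0, 0, 2, 2, 2, 4, 4, 2, which is the expected df. Swapping the order to bfill().ffill() would prefer the next day over the previous one instead.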
