How to fill nan values from a specific date range in a python time series?

I'm working with a time series that records the price of a fish in the markets of a Brazilian city from 2013 to 2021. The original dataset has three columns: the lowest price found, the highest price found, and the average price on the day the data was collected. I made three subsets, one per column, each indexed by date. During exploratory analysis I found that some specific months in 2013 and 2014 contain NaN values.

dfmin.loc['2013-4-1':'2013-7-31']
    min
date    
2013-04-01 12:00:00 16.0
2013-04-02 12:00:00 16.0
2013-05-22 12:00:00 NaN
2013-05-23 12:00:00 NaN
2013-05-24 12:00:00 NaN
2013-05-27 12:00:00 NaN
2013-05-28 12:00:00 NaN
2013-05-29 12:00:00 NaN
2013-05-30 12:00:00 NaN
2013-05-31 12:00:00 NaN
2013-06-03 12:00:00 NaN
2013-06-04 12:00:00 NaN
2013-06-05 12:00:00 NaN
2013-06-06 12:00:00 NaN
2013-06-07 12:00:00 NaN
2013-06-10 12:00:00 NaN
2013-06-11 12:00:00 NaN
2013-06-12 12:00:00 NaN
2013-06-13 12:00:00 NaN
2013-06-14 12:00:00 NaN
2013-06-17 12:00:00 NaN
2013-06-18 12:00:00 NaN
2013-06-19 12:00:00 15.8
2013-06-20 12:00:00 15.8
2013-06-21 12:00:00 15.8

I want to fill the NaN values in month 05 with the mean of the prices from month 04 and month 06. How can I do that?

>Solution :

IIUC, you can use simple indexing:

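import pandas as pd

# if needed, convert the index to datetime
#df.index = pd.to_datetime(df.index)

# overwrite every value in May with the mean of the April and June values
# (the month test matches May, April and June in every year of the index)
df.loc[df.index.month==5, 'min'] = df.loc[df.index.month.isin([4,6]), 'min'].mean()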

Or, if month 5 also contains non-NaN values that you want to keep:

mask = df.index.month==5
df.loc[mask, 'min'] = (df.loc[mask, 'min']
                         .fillna(df.loc[df.index.month.isin([4,6]), 'min'].mean())
                       )

output:

                       min
date                      
2013-04-01 12:00:00  16.00
2013-04-02 12:00:00  16.00
2013-05-22 12:00:00  15.88
2013-05-23 12:00:00  15.88
2013-05-24 12:00:00  15.88
2013-05-27 12:00:00  15.88
2013-05-28 12:00:00  15.88
2013-05-29 12:00:00  15.88
2013-05-30 12:00:00  15.88
2013-05-31 12:00:00  15.88
2013-06-03 12:00:00    NaN
2013-06-04 12:00:00    NaN
2013-06-05 12:00:00    NaN
2013-06-06 12:00:00    NaN
2013-06-07 12:00:00    NaN
2013-06-10 12:00:00    NaN
2013-06-11 12:00:00    NaN
2013-06-12 12:00:00    NaN
2013-06-13 12:00:00    NaN
2013-06-14 12:00:00    NaN
2013-06-17 12:00:00    NaN
2013-06-18 12:00:00    NaN
2013-06-19 12:00:00  15.80
2013-06-20 12:00:00  15.80
2013-06-21 12:00:00  15.80
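
Note that a test on `df.index.month` alone matches the same month in every year, while the question mentions gaps in specific months of 2013 and 2014. If only one year should be touched, the masks can also test `df.index.year`. Below is a minimal sketch under that assumption; the sample frame and the helper name `fill_month_with_neighbors` are made up for illustration, and the helper does not handle January or December (no wrap-around to the neighboring year).

```python
import numpy as np
import pandas as pd

# Hypothetical sample loosely mirroring the question's data.
dates = pd.to_datetime([
    "2013-04-01 12:00", "2013-04-02 12:00",
    "2013-05-22 12:00", "2013-05-23 12:00",
    "2013-06-18 12:00", "2013-06-19 12:00",
    "2013-06-20 12:00", "2013-06-21 12:00",
    "2014-05-05 12:00",
])
df = pd.DataFrame(
    {"min": [16.0, 16.0, np.nan, np.nan, np.nan, 15.8, 15.8, 15.8, 20.0]},
    index=dates,
)
df.index.name = "date"


def fill_month_with_neighbors(frame, column, year, month):
    """Fill NaNs in one (year, month) with the mean of the previous and
    next month of the same year. Assumes 2 <= month <= 11."""
    idx = frame.index
    target = (idx.year == year) & (idx.month == month)
    neighbors = (idx.year == year) & idx.month.isin([month - 1, month + 1])
    # Series.mean() skips NaN by default, so gaps in the neighbors are ignored.
    fill_value = frame.loc[neighbors, column].mean()
    out = frame.copy()  # leave the caller's frame untouched
    out.loc[target, column] = out.loc[target, column].fillna(fill_value)
    return out


filled = fill_month_with_neighbors(df, "min", year=2013, month=5)
print(filled)
```

With this sample, the May 2013 rows receive the April/June 2013 mean, the May 2014 row keeps its own value, and the NaN in June 2013 is left as is, just like in the output above.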