Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

filling in the missing times using average for the values pythn

I have this dataframe with some time missing (I want it to be every minute). Please see the sample below:

time = np.array([pd.to_datetime("2022-01-01 00:00:00"),pd.to_datetime("2022-01-01 00:00:01"),pd.to_datetime("2022-01-01 00:00:03"), pd.to_datetime("2022-01-01 00:00:04"),pd.to_datetime("2022-01-01 00:00:07"),pd.to_datetime("2022-01-01 00:00:09"), pd.to_datetime("2022-01-01 00:00:10")])
lat = [58.1, 58.4, 58.5, 58.9, 59,59.2, 59.5]
lng = [1.34, 1.44, 1.46, 1.48, 1.55, 1.57, 1.59]

df =  pd.DataFrame({"time": time, "lat": lat, "lng" :lng})


time                lat     lng
2022-01-01 00:00:00 58.1    1.34
2022-01-01 00:00:01 58.4    1.44
2022-01-01 00:00:03 58.5    1.46
2022-01-01 00:00:04 58.9    1.48
2022-01-01 00:00:07 59.0    1.55
2022-01-01 00:00:09 59.2    1.57
2022-01-01 00:00:10 59.5    1.59

I want to fill out the gaps in time so there is data for every minute annd the lat/lng is filled in with an average of the values in between. my plan was to create an array of time for each minute and try using ffil or something similar to fill out the missing points. But I cannot figure out how. The expected out put would be this

time                lat     lng
2022-01-01 00:00:00 58.1    1.34
2022-01-01 00:00:01 58.4    1.44
2022-01-01 00:00:01 58.45   1.45
2022-01-01 00:00:03 58.5    1.46
2022-01-01 00:00:04 58.9    1.48
2022-01-01 00:00:05 58.933  1.5033
2022-01-01 00:00:06 58.966  1.5233
2022-01-01 00:00:07 59.0    1.55
2022-01-01 00:00:08 59.1    1.56
2022-01-01 00:00:09 59.2    1.57
2022-01-01 00:00:10 59.5    1.59

Please give me some advice on how to do this

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Create DatetimeIndex then add missing times by div.DataFrame.asfreq and interpolate by DataFrame.interpolate:

df = df.set_index('time').asfreq(freq='S').interpolate()
print (df)
                           lat       lng
time                                    
2022-01-01 00:00:00  58.100000  1.340000
2022-01-01 00:00:01  58.400000  1.440000
2022-01-01 00:00:02  58.450000  1.450000
2022-01-01 00:00:03  58.500000  1.460000
2022-01-01 00:00:04  58.900000  1.480000
2022-01-01 00:00:05  58.933333  1.503333
2022-01-01 00:00:06  58.966667  1.526667
2022-01-01 00:00:07  59.000000  1.550000
2022-01-01 00:00:08  59.100000  1.560000
2022-01-01 00:00:09  59.200000  1.570000
2022-01-01 00:00:10  59.500000  1.590000
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading