Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Fill missing dates hourly per group with previous value for only certain conditioned values in certain column

I have the following dataframe (sample):

import pandas as pd

data = [['A', '2022-09-01 10:00:00', False, 2], ['A', '2022-09-01 12:00:00', True, 3], ['A', '2022-09-01 14:00:00', False, 1], 
        ['B', '2022-09-01 13:00:00', False, 1], ['B', '2022-09-01 16:00:00', True, 4], ['B', '2022-09-01 18:00:00', False, 3]]
df = pd.DataFrame(data = data, columns = ['group', 'date', 'indicator', 'value'])

  group                 date  indicator  value
0     A  2022-09-01 10:00:00      False      2
1     A  2022-09-01 12:00:00       True      3
2     A  2022-09-01 14:00:00      False      1
3     B  2022-09-01 13:00:00      False      1
4     B  2022-09-01 16:00:00       True      4
5     B  2022-09-01 18:00:00      False      3

I would like to fill in the missing dates hourly per group where the value is the same as the previous existing row. When the previous existing row has an indicator of True, I would like to fill these missing rows with an indicator of False instead of True but the value is although the same. Here is the desired output:

    data = [['A', '2022-09-01 10:00:00', False, 2], ['A', '2022-09-01 11:00:00', False, 2], ['A', '2022-09-01 12:00:00', True, 3], ['A', '2022-09-01 13:00:00', False, 3], ['A', '2022-09-01 14:00:00', False, 1], 
            ['B', '2022-09-01 13:00:00', False, 1], ['B', '2022-09-01 14:00:00', False, 1], ['B', '2022-09-01 15:00:00', False, 1], ['B', '2022-09-01 16:00:00', True, 4], ['B', '2022-09-01 17:00:00', False, 4], ['B', '2022-09-01 18:00:00', False, 3]]
    df_desired = pd.DataFrame(data = data, columns = ['group', 'date', 'indicator', 'value'])

   group                 date  indicator  value
0      A  2022-09-01 10:00:00      False      2
1      A  2022-09-01 11:00:00      False      2
2      A  2022-09-01 12:00:00       True      3
3      A  2022-09-01 13:00:00      False      3
4      A  2022-09-01 14:00:00      False      1
5      B  2022-09-01 13:00:00      False      1
6      B  2022-09-01 14:00:00      False      1
7      B  2022-09-01 15:00:00      False      1
8      B  2022-09-01 16:00:00       True      4
9      B  2022-09-01 17:00:00      False      4
10     B  2022-09-01 18:00:00      False      3

As you can see the dates are filled hourly per group and the indicators became False when the previous indicator was True.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

So I was wondering if anyone knows how to fill these missing dates hourly per group with keeping care of when the indicator is True using pandas?

>Solution :

First create DatetimeIndex with DataFrame.set_index
and then in lambda function add missing hours by DataFrame.asfreq, last replace missing values by Series.fillna and forward filling missing values:

df['date'] = pd.to_datetime(df['date'])

df = (df.set_index('date')
        .groupby('group')[['indicator', 'value']]
        .apply(lambda x: x.asfreq('H'))
        .assign(indicator = lambda x: x['indicator'].fillna(False), 
                value = lambda x: x['value'].ffill())
        .reset_index())
print (df)
   group                date  indicator  value
0      A 2022-09-01 10:00:00      False    2.0
1      A 2022-09-01 11:00:00      False    2.0
2      A 2022-09-01 12:00:00       True    3.0
3      A 2022-09-01 13:00:00      False    3.0
4      A 2022-09-01 14:00:00      False    1.0
5      B 2022-09-01 13:00:00      False    1.0
6      B 2022-09-01 14:00:00      False    1.0
7      B 2022-09-01 15:00:00      False    1.0
8      B 2022-09-01 16:00:00       True    4.0
9      B 2022-09-01 17:00:00      False    4.0
10     B 2022-09-01 18:00:00      False    3.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading