Pandas resample drops (static) datetime column, how do I keep it?

December 4, 2022

I’m working with a pandas DataFrame whose MultiIndex has three levels:
[Verbundzuordnung, ProjektIndex, Datum].

I would like to resample the DataFrame hourly on Datum. Doing so drops the rightmost column, TagDesAbdichtens, which I would like to keep because it is static.

            
Verbundzuordnung    ProjektIndex    Datum                           TagDesAbdichtens
1                   81679           2021-11-10 00:00:00+00:00       2021-12-08
                                    2021-11-10 00:00:00+00:00       2021-12-08
                                    2021-11-10 00:00:00+00:00       2021-12-08
                                    2021-11-10 00:00:00+00:00       2021-12-08
                                    2021-11-10 00:00:00+00:00       2021-12-08
...     ...     ...     ...
2                   94574           2022-02-28 23:00:00+00:00       2022-01-31
                                    2022-02-28 23:00:00+00:00       2022-01-31
                                    2022-02-28 23:00:00+00:00       2022-01-31
                                    2022-02-28 23:00:00+00:00       2022-01-31
                                    2022-02-28 23:00:00+00:00       2022-01-31

285192 rows × 1 columns

There are additional columns that I left out here for easier comprehension.

I am currently using this to resample the DataFrame:

all_merged = all_merged.groupby([
    pd.Grouper(level='Verbundzuordnung'),
    pd.Grouper(level='ProjektIndex'),
    pd.Grouper(level='Datum', freq='H'),
])

all_merged.mean() gives me the desired output, but with TagDesAbdichtens missing.
This value is unique and static for each combination of Verbundzuordnung and ProjektIndex, and I would like to have it back in the resampled version.

Is there a way to do it with native pandas functions?

>Solution :

I’ve had success resampling using the native resample function. For example,

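    # Assumes `data` has a DatetimeIndex and that the keys below are ordinary
    # columns; index levels would first need data.reset_index(...).
    resample_dict = {
        'Verbundzuordnung': 'mean',
        'ProjektIndex': 'mean',
        'TagDesAbdichtens': 'first'
    }

    # "60min" is the same hourly bin as "60T"; the "T" alias is deprecated in recent pandas.
    data = data.resample("60min", closed='left', label='left').apply(resample_dict)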

You can use any aggregation function in place of mean for each column (e.g. first, min, max).
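The same per-column idea can probably be applied directly to the groupby from the question, so the MultiIndex does not have to be flattened first. The following is only a minimal sketch under assumptions: the data is a made-up miniature of the frame in the question, and "Messwert" is a hypothetical numeric column standing in for the columns that were left out. Every column is averaged except TagDesAbdichtens, which is carried along with 'first'.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question.
# "Messwert" is an invented numeric column, not part of the original data.
idx = pd.MultiIndex.from_arrays(
    [
        [1, 1, 1, 2, 2],
        [81679, 81679, 81679, 94574, 94574],
        pd.to_datetime(
            [
                "2021-11-10 00:05",
                "2021-11-10 00:35",
                "2021-11-10 01:10",
                "2022-02-28 23:00",
                "2022-02-28 23:30",
            ],
            utc=True,
        ),
    ],
    names=["Verbundzuordnung", "ProjektIndex", "Datum"],
)
all_merged = pd.DataFrame(
    {
        "Messwert": [1.0, 3.0, 5.0, 10.0, 20.0],
        "TagDesAbdichtens": pd.to_datetime(["2021-12-08"] * 3 + ["2022-01-31"] * 2),
    },
    index=idx,
)

# Same grouping as in the question: two plain levels plus hourly bins on Datum.
grouped = all_merged.groupby(
    [
        pd.Grouper(level="Verbundzuordnung"),
        pd.Grouper(level="ProjektIndex"),
        pd.Grouper(level="Datum", freq="60min"),
    ]
)

# Average every column except the static one, which is kept via 'first'.
agg_spec = {col: "mean" for col in all_merged.columns if col != "TagDesAbdichtens"}
agg_spec["TagDesAbdichtens"] = "first"

resampled = grouped.agg(agg_spec)
print(resampled)
```

Since TagDesAbdichtens is constant within each Verbundzuordnung/ProjektIndex pair, 'first' (or 'last', 'min', 'max') should return the same value for every hourly bin of that pair.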

See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.resample.html for more.