I'm working with a pandas DataFrame whose MultiIndex has three levels:
[Verbundzuordnung, ProjektIndex, Datum]
I would like to resample the DataFrame hourly on Datum. Doing so drops the rightmost column, TagDesAbdichtens, which I would like to keep because it is static.
Verbundzuordnung  ProjektIndex  Datum                      TagDesAbdichtens
1                 81679         2021-11-10 00:00:00+00:00  2021-12-08
                                2021-11-10 00:00:00+00:00  2021-12-08
                                2021-11-10 00:00:00+00:00  2021-12-08
                                2021-11-10 00:00:00+00:00  2021-12-08
                                2021-11-10 00:00:00+00:00  2021-12-08
...               ...           ...                        ...
2                 94574         2022-02-28 23:00:00+00:00  2022-01-31
                                2022-02-28 23:00:00+00:00  2022-01-31
                                2022-02-28 23:00:00+00:00  2022-01-31
                                2022-02-28 23:00:00+00:00  2022-01-31
                                2022-02-28 23:00:00+00:00  2022-01-31
285192 rows × 1 columns
There are additional columns that I left out here for easier comprehension.
I am currently using this to resample the DataFrame:
all_merged = all_merged.groupby([
    pd.Grouper(level='Verbundzuordnung'),
    pd.Grouper(level='ProjektIndex'),
    pd.Grouper(level='Datum', freq='H'),
])
all_merged.mean() gives me the desired output, but with TagDesAbdichtens missing.
This value is unique and static for each combination of Verbundzuordnung and ProjektIndex, and I would like to have it back in the resampled version.
Is there a way to do it with native pandas functions?
>Solution :
I've had success resampling with the native resample method and a per-column aggregation. For example:
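# Assumes `data` has a plain DatetimeIndex (Datum) and that the keys below
# are regular columns, e.g. after data.reset_index(['Verbundzuordnung', 'ProjektIndex']).
resample_dict = {
    'Verbundzuordnung': 'mean',
    'ProjektIndex': 'mean',
    'TagDesAbdichtens': 'first',  # static value: just take the first one per bin
}
data = data.resample("60T", closed='left', label='left').agg(resample_dict)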
You can use any aggregation function in place of mean for each column (e.g. first, min, max).
See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.resample.html for more.
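The same per-column idea can probably be applied directly to the groupby from the question, which avoids resetting the MultiIndex. The following is only a minimal sketch on made-up data: the column Messwert and all values are hypothetical stand-ins for the columns left out of the question, and "60min" is used as the hourly frequency alias.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two (Verbundzuordnung, ProjektIndex) groups,
# each with eight 15-minute readings (two hours of data).
dates = pd.date_range("2021-11-10", periods=8, freq="15min", tz="UTC")
frames = []
for verbund, projekt, seal_day in [(1, 81679, "2021-12-08"), (2, 94574, "2022-01-31")]:
    frames.append(pd.DataFrame({
        "Verbundzuordnung": verbund,
        "ProjektIndex": projekt,
        "Datum": dates,
        "Messwert": np.arange(8, dtype=float),     # hypothetical numeric column
        "TagDesAbdichtens": pd.Timestamp(seal_day),  # static per group
    }))
all_merged = pd.concat(frames).set_index(
    ["Verbundzuordnung", "ProjektIndex", "Datum"]
)

# Same grouping as in the question: two index levels plus hourly bins on Datum.
grouped = all_merged.groupby([
    pd.Grouper(level="Verbundzuordnung"),
    pd.Grouper(level="ProjektIndex"),
    pd.Grouper(level="Datum", freq="60min"),
])

# Average the measurements, but carry the static column through with "first".
resampled = grouped.agg({"Messwert": "mean", "TagDesAbdichtens": "first"})
print(resampled)
```

Because TagDesAbdichtens is constant within each Verbundzuordnung/ProjektIndex pair, "first" (or "last", "min", "max") returns the same value for every hourly bin, so nothing is lost by aggregating it this way.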