I have a dataframe with multiple days of data in it (only one day shown here for brevity):
Charge
2022-01-03 13:19:02 99.5
2022-01-03 13:20:03 95.0
2022-01-03 13:21:02 64.2
2022-01-03 13:22:02 91.8
2022-01-03 13:23:02 99.5
I want to find the rows with the min and max values so that I can get the exact times of the min and max charge. If there are ties, I will just take the first occurrence. For example:
Charge
2022-01-03 13:19:02 99.5
2022-01-03 13:21:02 64.2
I have tried using:
df_bat_chrg_min = df['Battery State of Charge'].groupby(df.index.day).min()
df_bat_chrg_max = df['Battery State of Charge'].groupby(df.index.day).max()
df_bat_chrg = pd.merge(df_bat_chrg_max, df_bat_chrg_min, left_index=True, right_index=True)
This generates:
Max Charge Min Charge
2022-01-03 100.0 96.5
The index, however, only holds the date and doesn’t include the exact time of the event, unlike the desired output in the second code block.
>Solution :
Use DataFrameGroupBy.idxmin and DataFrameGroupBy.idxmax to get the index labels of the minimal and maximal values in each group, stack them into a single Series, and select the matching rows of the original DatetimeIndex with DataFrame.loc:
df1 = (df.loc[df.groupby(df.index.day)['Charge']
.agg(['idxmin', 'idxmax']).stack()].sort_index())
print (df1)
Charge
2022-01-03 13:19:02 99.5
2022-01-03 13:21:02 64.2
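For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample from the question by hand; the names `grouped` and `stamps` are only illustrative, and it swaps `stack()` for `pd.concat` to collect the timestamps, which should select the same rows here.

```python
import pandas as pd

# Sample data copied from the question (one day only).
df = pd.DataFrame(
    {"Charge": [99.5, 95.0, 64.2, 91.8, 99.5]},
    index=pd.to_datetime([
        "2022-01-03 13:19:02",
        "2022-01-03 13:20:03",
        "2022-01-03 13:21:02",
        "2022-01-03 13:22:02",
        "2022-01-03 13:23:02",
    ]),
)

grouped = df.groupby(df.index.day)["Charge"]

# idxmin/idxmax return the index label (here a timestamp) of the first
# occurrence of each group's minimum and maximum.
stamps = pd.concat([grouped.idxmin(), grouped.idxmax()])

# Use those timestamps as row labels on the original frame.
df1 = df.loc[stamps.to_numpy()].sort_index()
print(df1)
```

Note that 99.5 appears twice in the sample, and the row at 13:19:02 is the one that gets picked, which matches the "first occurrence" requirement.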
If you need the values and their timestamps as new aggregated columns:
df2 = df.groupby(df.index.day)['Charge'].agg(['min','max', 'idxmin', 'idxmax'])
print (df2)
min max idxmin idxmax
3 64.2 99.5 2022-01-03 13:21:02 2022-01-03 13:19:02
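One caveat worth checking against your data: `df.index.day` is the day of the month only, so if the frame spans more than one month, rows from e.g. 3 January and 3 February would fall into the same group. A possible variation, assuming that is not what you want, is to group by `df.index.normalize()` instead, which keeps the full date. The readings below are made up purely for illustration.

```python
import pandas as pd

# Hypothetical readings: the same day-of-month (3) in two different months.
df = pd.DataFrame(
    {"Charge": [99.5, 64.2, 80.0, 70.1, 88.8, 88.8]},
    index=pd.to_datetime([
        "2022-01-03 13:19:02",
        "2022-01-03 13:21:02",
        "2022-01-03 13:23:02",
        "2022-02-03 09:00:00",
        "2022-02-03 09:01:00",
        "2022-02-03 09:02:00",
    ]),
)

# Grouping by day number: both dates share the key 3, so they are merged.
by_day_number = df.groupby(df.index.day)["Charge"].agg(["min", "max"])

# Grouping by normalized timestamp (time set to midnight): one group per date.
by_date = df.groupby(df.index.normalize())["Charge"].agg(
    ["min", "max", "idxmin", "idxmax"]
)
print(by_date)
```

With this grouping the result index is the calendar date rather than a bare day number, and the `idxmin` / `idxmax` columns still carry the exact times.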