how to get smallest index in dataframe after using groupby

If create_date field does not correspond to period between from_date and to_date, I want to extract only the large index records using group by ‘indicator’ and record correspond to period between from_date to end_date.

from_date = '2022-01-01'
to_date = '2022-04-10'

   indicator    create_date
0      A         2022-01-03
1      B         2021-12-30
2      B         2021-07-11
3      C         2021-02-10
4      C         2021-09-08
5      C         2021-07-24
6      C         2021-01-30

Here is the result I want:

   indicator   create_date
0      A         2022-01-03
2      B         2021-07-11
6      C         2021-01-30

I’ve been looking for a solution for a long time, but I only found a way "How to get the index of smallest value", and I can’t find a way to compare the index number.

>Solution :

You can create helper column for maximal index values per indicator created by DataFrameGroupBy.idxmax, last select rows by DataFrame.loc:

df2 = df.loc[df.assign(tmp=df.index).groupby('indicator')['tmp'].idxmax()]
print (df2)
  indicator create_date
0         A  2022-01-03
2         B  2021-07-11
6         C  2021-01-30

Leave a Reply