If create_date field does not correspond to period between from_date and to_date, I want to extract only the large index records using group by ‘indicator’ and record correspond to period between from_date to end_date.
from_date = '2022-01-01'
to_date = '2022-04-10'
indicator create_date
0 A 2022-01-03
1 B 2021-12-30
2 B 2021-07-11
3 C 2021-02-10
4 C 2021-09-08
5 C 2021-07-24
6 C 2021-01-30
Here is the result I want:
indicator create_date
0 A 2022-01-03
2 B 2021-07-11
6 C 2021-01-30
I’ve been looking for a solution for a long time, but I only found a way "How to get the index of smallest value", and I can’t find a way to compare the index number.
>Solution :
You can create helper column for maximal index values per indicator created by DataFrameGroupBy.idxmax, last select rows by DataFrame.loc:
df2 = df.loc[df.assign(tmp=df.index).groupby('indicator')['tmp'].idxmax()]
print (df2)
indicator create_date
0 A 2022-01-03
2 B 2021-07-11
6 C 2021-01-30