How to get the largest index per group in a DataFrame after using groupby

I want to keep the records whose create_date falls within the period between from_date and to_date. If an 'indicator' group has no create_date in that period, I want to keep only the record with the largest index in that group.

from_date = '2022-01-01'
to_date = '2022-04-10'

   indicator    create_date
0      A         2022-01-03
1      B         2021-12-30
2      B         2021-07-11
3      C         2021-02-10
4      C         2021-09-08
5      C         2021-07-24
6      C         2021-01-30

Here is the result I want:

   indicator   create_date
0      A         2022-01-03
2      B         2021-07-11
6      C         2021-01-30

I've been looking for a solution for a long time, but I only found "How to get the index of the smallest value", and I can't find a way to compare the index numbers themselves.

Solution:

You can copy the index into a helper column, use DataFrameGroupBy.idxmax to find the largest index value per indicator, and then select those rows with DataFrame.loc:

# tmp holds each row's index; idxmax returns the label of the largest tmp per indicator
df2 = df.loc[df.assign(tmp=df.index).groupby('indicator')['tmp'].idxmax()]
print(df2)
  indicator create_date
0         A  2022-01-03
2         B  2021-07-11
6         C  2021-01-30
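
The one-liner above picks the largest index per indicator without looking at create_date, which happens to match the expected output for this sample. A minimal sketch that also applies the date window might look like the following. It assumes one reading of the question: rows inside [from_date, to_date] are always kept, and a group with no in-range row falls back to its largest-index row. The names in_range, has_in_range, idx, max_idx and is_last are illustrative, not from the original post.

```python
import pandas as pd

from_date = pd.Timestamp("2022-01-01")
to_date = pd.Timestamp("2022-04-10")

# Sample data from the question, with create_date parsed as datetimes.
df = pd.DataFrame(
    {
        "indicator": ["A", "B", "B", "C", "C", "C", "C"],
        "create_date": pd.to_datetime(
            ["2022-01-03", "2021-12-30", "2021-07-11", "2021-02-10",
             "2021-09-08", "2021-07-24", "2021-01-30"]
        ),
    }
)

# Rows whose create_date lies inside the window (both ends inclusive).
in_range = df["create_date"].between(from_date, to_date)

# True for every row of a group that has at least one in-range row.
has_in_range = in_range.groupby(df["indicator"]).transform("any")

# Largest index label per indicator, broadcast back to each row.
idx = pd.Series(df.index, index=df.index)
max_idx = idx.groupby(df["indicator"]).transform("max")
is_last = idx.eq(max_idx)

# Assumption: keep in-range rows; otherwise keep the group's largest-index row.
result = df[in_range | (~has_in_range & is_last)]
print(result)
```

On the sample data this selects rows 0, 2 and 6, the same rows as df2. The two approaches diverge only when a group has an in-range row that is not its largest-index row.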