How do you get the minimum of a group and the value from the corresponding row of another column in pandas?

How do I get the minimum date of each group in pandas, along with the value of a different column taken from the row where that minimum date occurs?

import pandas as pd

data = {'date_value': ['2014-06-22', '2014-06-24', '2014-06-25', '2014-06-25', '2014-07-02', '2014-07-06', '2014-07-05', '2014-07-27'],
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C'],
    'sum_col': [1, 2, 3, 5, 4, 3, 2, 1]}
df = pd.DataFrame(data, columns=['date_value', 'type', 'sum_col'])
df['date_value'] = pd.to_datetime(df['date_value'])
# Select several columns with a list (double brackets); recent pandas rejects the tuple form
df[['min_date','new_sum_col']] = df.groupby('type')[['date_value','sum_col']].transform('min')

The result I get is:

  date_value type  sum_col   min_date  new_sum_col
0 2014-06-22    A        1 2014-06-22            1
1 2014-06-24    A        2 2014-06-22            1
2 2014-06-25    A        3 2014-06-22            1
3 2014-06-25    B        5 2014-06-25            2
4 2014-07-02    B        4 2014-06-25            2
5 2014-07-06    B        3 2014-06-25            2
6 2014-07-05    B        2 2014-06-25            2
7 2014-07-27    C        1 2014-07-27            1

The result I want is:

   date_value type  sum_col    min_date  new_sum_col
0  2014-06-22    A        1  2014-06-22            1
1  2014-06-24    A        2  2014-06-22            1
2  2014-06-25    A        3  2014-06-22            1
3  2014-06-25    B        5  2014-06-25            5
4  2014-07-02    B        4  2014-06-25            5
5  2014-07-06    B        3  2014-06-25            5
6  2014-07-05    B        2  2014-06-25            5
7  2014-07-27    C        1  2014-07-27            1

Any help would be appreciated.

Thank you,

Sam

Solution:

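transform('min') takes the minimum of each column independently, so new_sum_col ends up as the smallest sum_col in the group rather than the value on the row with the earliest date. Instead, sort by date_value and use transform('first'), which takes both column values from the row with the minimum date in each group. The result keeps the original index, so the assignment lines up with df even though the rows were sorted: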

df[['min_date', 'new_sum_col']] = df.sort_values('date_value', ascending=True).groupby('type')[['date_value', 'sum_col']].transform('first')
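
For comparison, here is a minimal self-contained sketch of a different route to the same table: find the row label of the earliest date in each group with idxmin, pull both values from that row, and merge them back onto every row. This is an alternative technique, not the method from the answer; the names idx, firsts, and result are made up for illustration, and the data is the sample from the question.

```python
import pandas as pd

data = {
    'date_value': ['2014-06-22', '2014-06-24', '2014-06-25', '2014-06-25',
                   '2014-07-02', '2014-07-06', '2014-07-05', '2014-07-27'],
    'type': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C'],
    'sum_col': [1, 2, 3, 5, 4, 3, 2, 1],
}
df = pd.DataFrame(data)
df['date_value'] = pd.to_datetime(df['date_value'])

# Row label of the earliest date within each type (first occurrence on ties).
idx = df.groupby('type')['date_value'].idxmin()

# One row per type: the minimum date and the sum_col found on that same row.
firsts = (
    df.loc[idx, ['type', 'date_value', 'sum_col']]
      .rename(columns={'date_value': 'min_date', 'sum_col': 'new_sum_col'})
)

# Broadcast the per-group values back onto every row of the original frame.
result = df.merge(firsts, on='type', how='left')
print(result)
```

Unlike the sort-and-transform one-liner, the merge returns a new frame with a fresh 0..n-1 index instead of assigning into df, which may or may not matter depending on how the original index is used.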