Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How do I create a new column of max values of a column(corresponding to specific name) using pandas?

I’m wondering if it is possible to use Pandas to create a new column for the max values of a column (corresponding to different names, so that each name will have a max value).

For an example:

name    value    max
Alice    1        9
Linda    1        1
Ben      3        5
Alice    4        9
Alice    9        9
Ben      5        5
Linda    1        1

So for Alice, we are picking the max of 1, 4, and 9, which is 9. For Linda max(1,1) = 1, and for Ben max(3,5) = 5.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I was thinking of using .loc to select the name == "Alice", then get the max value of these rows, then create the new column. But since I’m dealing with a large dataset, this does not seem like a good option. Is there a smarter way to do this so that I don’t need to know what specific names?

>Solution :

groupby and taking a max gives the max by name, which is then merged with the original df

df.merge(df.groupby(['name'])['value'].max().reset_index(), 
         on='name').rename(
                    columns={'value_x' : 'value', 
                             'value_y' : 'max'})
    name    value   max
0   Alice   1   9
1   Alice   4   9
2   Alice   9   9
3   Linda   1   1
4   Linda   1   1
5   Ben     3   5
6   Ben     5   5
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading