print pandas dataframe diff to new column

I have a dataframe that looks like this. There are two rows for each id. Each pair represents a game in which the row with the higher points is the winner:

id   points
677    5
677    15
678    25
678    6

I would like to add a new column 'win' to the dataframe so that, within each id, the row with the higher points gets the value 1 and the other row gets 0.

Like this:

id   points  win
677    5      0
677    15     1
678    25     1
678    6      0

I think I could do something like this, but I can't figure out how to get the diff to output a value based on whether it is greater or less than zero and then write that value to a new column.

print(df.set_index('id').groupby(level=0).diff().query('points > 0').index.unique().tolist())

Solution:

Find the max points for each id and mark it as win:

df['win'] = (df.points.groupby(df['id']).transform('max') == df.points).astype(int)
df
    id  points  win
0  677       5    0
1  677      15    1
2  678      25    1
3  678       6    0
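
For anyone who wants to reproduce this, here is a minimal self-contained sketch built from the sample data in the question. The first column repeats the group-max comparison from the answer above. The other two are alternatives that are not part of the original answer: a diff-based variant closer to what the question attempted (it assumes exactly two rows per id), and an idxmax variant (it assumes a unique index). The column names win_diff and win_single are made up for illustration.

```python
import pandas as pd

# Sample data from the question: two rows per id.
df = pd.DataFrame({"id": [677, 677, 678, 678], "points": [5, 15, 25, 6]})

# 1) Compare each row with its group's maximum (same idea as the answer).
group_max = df.groupby("id")["points"].transform("max")
df["win"] = (df["points"] == group_max).astype(int)

# 2) diff-based variant. Assumption: exactly two rows per id.
#    diff() fills the second row of each pair, diff(-1) fills the first,
#    so every row ends up holding "my points minus the other row's points".
grouped = df.groupby("id")["points"]
delta = grouped.diff().fillna(grouped.diff(-1))
df["win_diff"] = (delta > 0).astype(int)

# 3) idxmax variant. Assumption: the index labels are unique.
#    idxmax returns one row label per id, so a tie yields a single winner.
winner_rows = df.groupby("id")["points"].idxmax()
df["win_single"] = df.index.isin(winner_rows).astype(int)

print(df)
```

On this data all three columns come out as 0, 1, 1, 0. They differ only when both rows of an id have the same points: the group-max comparison marks both rows as 1, the diff variant marks both as 0, and the idxmax variant marks only the first of the two.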