Iterate through a DataFrame and keep the max value of columns

I have a dataframe that looks like this:

datetime             hr    rmssd     neutral   happy     sad       angry
2012-09-18 13:17:00  61.0  0.061420  0.884570  0.076952  0.001144  0.017392
2012-09-18 13:18:00  64.0  0.049663  0.931965  0.031468  0.000371  0.023774

What I want is to create a new column that, on each row, holds the name of the column with the largest value. For example, in the first row the emotion column would say ‘neutral’.
I tried iterating through each row like this:

for i in range(0,len(df3)):
    df3['emotion']=df3[['neutral','happy','sad','angry']].max()

but my resulting dataframe had an extra column named ‘emotion’ filled with NaN values. Note that I’ve deleted all of the NaN values from my df.

I also tried using iloc:

for i in range(0,len(df3)):
    df3['emotion']=df3.iloc[3:][i].max()

but zero luck there as well. Any ideas?

>Solution :

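You don't need a for-loop for this. df3[['neutral','happy','sad','angry']].max() returns one value per column, indexed by the column names, so assigning it to a new column cannot align with the row index and yields NaN. idxmax(axis=1) instead returns, for each row, the name of the column holding the largest value. This is based on an existing answer, so the question may be a duplicate.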

df3['emotion'] = df3[['neutral','happy','sad','angry']].idxmax(axis=1)
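
As a rough, self-contained sketch of the one-liner above, the two rows from the question can be rebuilt and checked. Storing datetime as an ordinary column (rather than as the index) is an assumption, and the names emotion_cols, col_max and emotion_score are made up for illustration; they do not appear in the question.

```python
import pandas as pd

# Sample rows copied from the question. Keeping 'datetime' as a regular
# column is an assumption; the original frame may use it as the index.
df3 = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["2012-09-18 13:17:00", "2012-09-18 13:18:00"]
        ),
        "hr": [61.0, 64.0],
        "rmssd": [0.061420, 0.049663],
        "neutral": [0.884570, 0.931965],
        "happy": [0.076952, 0.031468],
        "sad": [0.001144, 0.000371],
        "angry": [0.017392, 0.023774],
    }
)

# Hypothetical helper list so the column names are written only once.
emotion_cols = ["neutral", "happy", "sad", "angry"]

# .max() with the default axis reduces down each column, so the result
# is indexed by column name, not by row. That is why the loop version
# could not line up with the rows of df3.
col_max = df3[emotion_cols].max()

# idxmax(axis=1) works across each row and returns the column label
# of the largest value in that row.
df3["emotion"] = df3[emotion_cols].idxmax(axis=1)

# Optional: keep the winning value itself next to its label.
df3["emotion_score"] = df3[emotion_cols].max(axis=1)

print(df3[["datetime", "emotion", "emotion_score"]])
```

Both sample rows should come out as 'neutral', since that column holds the largest of the four values in each row.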
