getting the index of the max value in list and returning the values in subsequent lists

June 20, 2022

I have a DataFrame df whose columns hold lists:

review                          review_id   word_count
['abc def', 'ghi']              [1, 2]      [2, 1]
['jkl mno pqr', 'stu', 'vwx']   [3, 4, 5]   [3, 1, 1]

For each row in the df, I would like to create new review and review_id columns that contain only the review and review_id with the maximum word count.
The output should look something like this:

review         review_id
abc def        1      
jkl mno pqr    3

So far I can only think of getting the index of the max value in word_count. Any suggestions are welcome. Thanks!

Solution:

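Use DataFrame.explode to expand the lists into one row per element, then compare each word_count with the maximum of its group (the original row, identified by index level 0) and select the matching rows by boolean indexing. If a row has several equal maxima, all of them are kept: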

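# one row per list element; the original row label is repeated in the index
df = df.explode(['review','review_id','word_count'])
# for older pandas versions (passing a list of columns to explode needs pandas 1.3+)
#df = df.apply(lambda x: x.explode())

# keep the rows whose word_count equals the maximum within their original row
df = df[df['word_count'].eq(df.groupby(level=0)['word_count'].transform('max'))]
print(df)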
        review review_id word_count
0      abc def         1          2
1  jkl mno pqr         3          3
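
If only one review per row is wanted, even when word counts tie, the idea from the question (take the index of the max value in word_count) can be applied row by row instead. The following is only a sketch under some assumptions: the sample frame is rebuilt by hand from the question, the three lists in each row have equal length, and pick_longest and out are illustrative names that do not appear in the original answer.

```python
import pandas as pd

# Sample frame rebuilt from the question (every column holds lists).
df = pd.DataFrame({
    'review': [['abc def', 'ghi'], ['jkl mno pqr', 'stu', 'vwx']],
    'review_id': [[1, 2], [3, 4, 5]],
    'word_count': [[2, 1], [3, 1, 1]],
})

def pick_longest(row):
    # Hypothetical helper: position of the first maximal word count in the row.
    counts = row['word_count']
    i = counts.index(max(counts))
    # Use that position to pull the matching review and review_id.
    return pd.Series({'review': row['review'][i],
                      'review_id': row['review_id'][i]})

# One output row per input row, so the original index is preserved.
out = df.apply(pick_longest, axis=1)
print(out)
```

Unlike the explode approach, this returns exactly one row per original row, because list.index stops at the first maximum. It runs a Python function for every row, so on large frames the explode version is likely the faster choice.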