Pandas Dataframe: groupby id to find max column value and return corresponding value of another column

I have a large dataframe with different food entries. Each food has one nutrient (A, B, C, D) with a corresponding value for that nutrient in another column.
I want to define a function which takes a specific nutrient as an argument and returns the name of the food with the highest nutrient value. If the argument does not exist, it should return ‘Sorry, {requested nutrient} not found’.

import pandas as pd

df = pd.DataFrame([[0.99, 0.87, 0.58, 0.66, 0.62, 0.81, 0.63, 0.71, 0.77, 0.73, 0.69, 0.61, 0.92, 0.49],
                   list('DAABBBBABCBDDD'),
                   ['apple', 'banana', 'kiwi', 'lemon', 'grape', 'cheese', 'eggs', 'spam', 'fish', 'bread',
                    'salad', 'milk', 'soda', 'juice'],
                   ['***', '**', '****', '*', '***', '*', '**', '***', '*', '*', '****', '**', '**', '****']]).T
df.columns = ['value', 'nutrient', 'food', 'price']

I have tried the following:

def food_for_nutrient(lookup_nutrient, dataframe=df):
    max_values = dataframe.groupby(['nutrient'])['value'].max()
    result = max_values[lookup_nutrient]
    return print(result)

It identifies the max value for each nutrient correctly, but it returns only the nutrient value. I need the corresponding string from the food column.
For instance, if I call the function with the following argument:

food_for_nutrient('A')

My desired output is:

banana

My second problem is that my if statement doesn’t work: it always takes the else branch.

def food_for_nutrient(lookup_nutrient, dataframe=df):
    max_values = dataframe.groupby(['nutrient'])['value'].max()
    if lookup_nutrient in dataframe['nutrient']:
        result = max_values[lookup_nutrient]
        return print(result)
    else:
        return print(f'Sorry, {lookup_nutrient} not found.')

food_for_nutrient('A')

Thanks a lot for your help!

Solution:

Try this:

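def food_for_nutrient(lookup_nutrient):
    try:
        # Keep only the rows for the requested nutrient, index them by food name,
        # and return the index label (the food) of the largest value.
        return df[df['nutrient'] == lookup_nutrient].set_index('food')['value'].astype(float).idxmax()
    except ValueError:
        # An unknown nutrient leaves an empty selection, and idxmax raises ValueError on it.
        return f'Sorry, {lookup_nutrient} not found.'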

Output:

>>> food_for_nutrient('A')
'banana'

>>> food_for_nutrient('B')
'cheese'

>>> food_for_nutrient('C')
'bread'

>>> food_for_nutrient('D')
'apple'

>>> food_for_nutrient('E')
'Sorry, E not found.'
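
As for why the if statement in the question always falls through to else: `x in series` tests the index labels of a pandas Series (here 0 to 13), not its values, so `'A' in dataframe['nutrient']` is False. If you would rather keep the groupby approach from the question, a minimal sketch could look like the following. The name `food_for_nutrient_groupby` is only an illustration, and the frame is rebuilt from a dict here (without the price column) so that the snippet runs on its own.

```python
import pandas as pd

# Same data as in the question, built column by column (price omitted).
df = pd.DataFrame({
    'value': [0.99, 0.87, 0.58, 0.66, 0.62, 0.81, 0.63,
              0.71, 0.77, 0.73, 0.69, 0.61, 0.92, 0.49],
    'nutrient': list('DAABBBBABCBDDD'),
    'food': ['apple', 'banana', 'kiwi', 'lemon', 'grape', 'cheese', 'eggs',
             'spam', 'fish', 'bread', 'salad', 'milk', 'soda', 'juice'],
})


def food_for_nutrient_groupby(lookup_nutrient, dataframe=df):
    # `in` on a Series checks the index, so test against the values instead.
    if lookup_nutrient not in dataframe['nutrient'].values:
        return f'Sorry, {lookup_nutrient} not found.'
    # Cast to float in case the column is stored as object dtype (as it is
    # after the transpose in the question), then take, per nutrient, the row
    # label of the largest value.
    values = dataframe['value'].astype(float)
    best_rows = values.groupby(dataframe['nutrient']).idxmax()
    # Look up the food name on the winning row.
    return dataframe.loc[best_rows[lookup_nutrient], 'food']


print(food_for_nutrient_groupby('A'))  # banana
print(food_for_nutrient_groupby('E'))  # Sorry, E not found.
```

Returning the string instead of `return print(...)` also matters: `print` returns None, so the original function always handed None back to the caller.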