Pandas – Conditionally finding max of row according to column value while maintaining index order

November 21, 2022

I’m looking for, ideally, a one-liner to accomplish the following:

I have the following dataframe:

import pandas as pd
import numpy as np

SIZE = 10
df = pd.DataFrame({'col1': np.random.randint(100, size=SIZE),
                   'col2': np.random.randint(100, size=SIZE),
                   'col3': np.random.randint(100, size=SIZE),
                   'col4': np.random.randint(2, size=SIZE)})
print(df)

outputting

   col1  col2  col3  col4
0    55    96    40     0
1    82    59    34     1
2    85    66    25     1
3    90    69    27     0
4    36    32    79     1
5    33    69    80     1
6    11    53    88     0
7    31    51    96     0
8    89    76    88     1
9     4    76    47     0

I’m currently ignoring col4 and calculating the max value of each row as follows:

df[['col1', 'col2', 'col3']].max(axis=1)

resulting in

0    96
1    82
2    85
3    90
4    79
5    80
6    88
7    96
8    89
9    76
dtype: int64

I want to use col4 to decide how each row's value is computed: if col4 is 0, take the value of col1; otherwise take the max of ['col2', 'col3']. I also want to keep the same index and row order as the original dataframe.

The end result would be

0    55    # col1
1    59    # max(col2, col3)
2    66    # max(col2, col3)
3    90    # col1
4    79    # max(col2, col3)
5    80    # max(col2, col3)
6    11    # col1
7    31    # col1
8    88    # max(col2, col3)
9     4    # col1
dtype: int64

One possibility would be to split the dataframe in two according to col4, calculate the result for each part, and join them again, but this could scramble the index order (I guess I could save the index and restore it). Any better ideas?

Apologies if this question has already been asked; I couldn’t find it with my search terms.

Solution:

There might be a better option, but this does the job by applying your rule row-wise as a lambda, which also keeps the original index and order:

df.apply(lambda x: x[["col2", "col3"]].max() if x["col4"] else x["col1"], axis=1)
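A vectorized alternative may be worth considering on larger frames, since a row-wise apply runs a Python function per row. The sketch below is not part of the original answer; it rebuilds the dataframe from the printed output in the question (rather than from random numbers) so the result can be checked, and uses Series.where to choose between the two candidates:

```python
import pandas as pd

# Sample data copied from the printed output in the question,
# so the result is reproducible.
df = pd.DataFrame({
    "col1": [55, 82, 85, 90, 36, 33, 11, 31, 89, 4],
    "col2": [96, 59, 66, 69, 32, 69, 53, 51, 76, 76],
    "col3": [40, 34, 25, 27, 79, 80, 88, 96, 88, 47],
    "col4": [0, 1, 1, 0, 1, 1, 0, 0, 1, 0],
})

# Row-wise max of col2 and col3, computed for every row at once.
max_23 = df[["col2", "col3"]].max(axis=1)

# Keep col1 where col4 == 0; elsewhere fall back to max(col2, col3).
result = df["col1"].where(df["col4"].eq(0), max_23)
print(result)
```

Because every intermediate Series shares the dataframe's index, the result stays aligned with the original rows without any splitting or re-joining.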

by MR

Published November 21, 2022
