Dataframe column value based on aggregation of several columns

Say I have a pandas dataframe as below:

A   B   C
1   4   0.1
2   3   0.5
4   1   0.7
5   2   0.2
7   5   0.6

I want to loop through the rows in the dataframe, and for each row perform on aggregation on columns A and B as:

Agg = row[A] / row[A] + row[B]

A   B   C     Agg
1   4   0.1   0.2
2   3   0.5   0.4
4   1   0.7   0.8
5   2   0.2   0.7
7   5   0.6   0.6

For all values of Agg > 0.6, get their corresponding column C values into a list, i.e. 0.7 and 0.2 in this case.

Last step is to get the minimum of the list i.e. min(list) = 0.2 in this instance.

>Solution :

We could use vectorized operations: add for addition, rdiv for division (for A/(A+B)), gt for greater than comparison and loc for the filtering:

out = df.loc[df['A'].add(df['B']).rdiv(df['A']).gt(0.6), 'C'].min()

We could also derive the same result using query much more concisely:

out = df.query('A/(A+B)>0.6')['C'].min()

Output:

0.2

Leave a Reply