Say I have a pandas dataframe as below:
A B C
1 4 0.1
2 3 0.5
4 1 0.7
5 2 0.2
7 5 0.6
I want to loop through the rows in the dataframe, and for each row perform on aggregation on columns A and B as:
Agg = row[A] / row[A] + row[B]
A B C Agg
1 4 0.1 0.2
2 3 0.5 0.4
4 1 0.7 0.8
5 2 0.2 0.7
7 5 0.6 0.6
For all values of Agg > 0.6, get their corresponding column C values into a list, i.e. 0.7
and 0.2
in this case.
Last step is to get the minimum of the list i.e. min(list) = 0.2
in this instance.
>Solution :
We could use vectorized operations: add
for addition, rdiv
for division (for A/(A+B)), gt
for greater than comparison and loc
for the filtering:
out = df.loc[df['A'].add(df['B']).rdiv(df['A']).gt(0.6), 'C'].min()
We could also derive the same result using query
much more concisely:
out = df.query('A/(A+B)>0.6')['C'].min()
Output:
0.2