Apply function to each cell in a row, based on another cell

For example, I have the following:

   a  b  benchmark
0  1  2          1
1  1  5          3

and I would like to apply a condition in Pandas for each column as:

def f(x):
  if x > benchmark:
    # x is a value from a or b; benchmark is the value in the same row
    return x
  else:
    return 0

But I don’t know how to do that. If I call df.apply(f), I can’t access the other cells in the row from inside f.

I don’t want to create a new column either. I want to change each cell directly as I compare it to benchmark, setting to 0 the cells that do not meet the benchmark.

Any insight?

>Solution :

You don’t need a function; use vectorized operations instead:

out = df.where(df.gt(df['benchmark'], axis=0), 0)
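
For a self-contained check, here is a minimal sketch that rebuilds the sample frame from the question and applies the same where call. The DataFrame constructor line is an assumption about how the data was created; the rest mirrors the line above. where returns a new frame and leaves df untouched:

```python
import pandas as pd

# Sample data from the question (construction is assumed for illustration)
df = pd.DataFrame({"a": [1, 1], "b": [2, 5], "benchmark": [1, 3]})

# Compare every column with benchmark row by row (axis=0 aligns on the index);
# keep values that are strictly greater, replace the rest with 0
out = df.where(df.gt(df["benchmark"], axis=0), 0)

print(out)  # new frame with the replacements
print(df)   # original frame, unchanged
```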

To change the values in place:

df[df.le(df['benchmark'], axis=0)] = 0

Output:

   a  b  benchmark
0  0  2          0
1  0  5          0
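
The benchmark column ends up as 0 because each benchmark value is compared with itself, and a value is always less than or equal to itself. A small sketch that prints the boolean mask makes this visible (the frame is rebuilt here from the question's sample data, which is an assumption about how it was created):

```python
import pandas as pd

# Sample data from the question (construction is assumed for illustration)
df = pd.DataFrame({"a": [1, 1], "b": [2, 5], "benchmark": [1, 3]})

# True marks the cells that will be overwritten with 0
m = df.le(df["benchmark"], axis=0)
print(m)

# benchmark <= benchmark holds in every row, so the whole column is True
print(m["benchmark"].all())
```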

If you don’t want to affect benchmark:

m = df.le(df['benchmark'], axis=0)
m['benchmark'] = False

df[m] = 0

Output:

   a  b  benchmark
0  0  2          1
1  0  5          3
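
As an alternative sketch (not part of the original answer), the comparison can be restricted to the value columns so that benchmark is never part of the mask. The name cols is a hypothetical helper, and the frame is rebuilt from the question's sample data:

```python
import pandas as pd

# Sample data from the question (construction is assumed for illustration)
df = pd.DataFrame({"a": [1, 1], "b": [2, 5], "benchmark": [1, 3]})

# Every column except benchmark (cols is a name chosen for this example)
cols = df.columns.drop("benchmark")

# Keep values strictly greater than the row's benchmark, zero out the rest,
# and write the result back into the same columns
df[cols] = df[cols].where(df[cols].gt(df["benchmark"], axis=0), 0)

print(df)
```

This avoids having to reset the benchmark column of the mask by hand, at the cost of naming the columns to process.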
