np.maximum.reduce error for a pandas Series and an int

np.maximum.reduce(lst) behaves differently from both functools.reduce(np.maximum, lst) and np.maximum itself when one of the elements in the list is a scalar (e.g. an int) rather than an array or pandas Series.


On one hand,

import pandas as pd
import numpy as np

df = pd.DataFrame({'a': [1, 3, 2]})

np.maximum.reduce([df['a'], 2])

The last line gives the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().


On the other hand,

np.maximum(df['a'], 2)

yields the expected output:

0    2
1    3
2    2
Name: a, dtype: int64


On a third hand,

from functools import reduce

reduce(np.maximum, [df['a'], 2])

also yields the expected output:

0    2
1    3
2    2
Name: a, dtype: int64

Versions used

pandas version: 1.2.5
numpy version: 1.19.5
python: 3.7.9

Solution:

See the docs for ufunc.reduce:

ufunc.reduce(array, axis=0, dtype=None, out=None, keepdims=False, initial=<no value>, where=True)

Reduces array’s dimension by one, by applying ufunc along one axis.

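[df['a'], 2] is not an array with a well-defined 0th axis, so it is unclear what reducing it along that axis should mean. The other two operations are plain element-wise maximums: each call to np.maximum broadcasts its two arguments against each other. A ufunc reduction, by contrast, operates on a single array, so NumPy first has to turn the list into one array, and a list that mixes a length-3 Series with a scalar has no regular shape to convert to.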
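If you still want a single reduction over mixed operands, here is a minimal sketch of two ways to get there, assuming every operand can be broadcast to a common shape. The variable names (operands, pairwise, stacked, reduced) are only illustrative.

```python
from functools import reduce

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 3, 2]})
operands = [df['a'], 2]

# Option 1: fold np.maximum pairwise. Each call broadcasts its two
# arguments, and the result stays a pandas Series.
pairwise = reduce(np.maximum, operands)

# Option 2: broadcast everything to a common shape first, stack the
# pieces into one 2-D array, then reduce along the new 0th axis.
# The result is a plain ndarray, so the Series index and name are lost.
stacked = np.stack(np.broadcast_arrays(*operands))
reduced = np.maximum.reduce(stacked, axis=0)

print(pairwise.tolist())
print(reduced.tolist())
```

The second option gives np.maximum.reduce the well-defined 0th axis it expects. The first is usually simpler when you want to keep the pandas index.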
