pandas: any(1) has suddenly started giving errors

My code was working perfectly. I updated openpyxl, and now when I try:

import pandas as pd

data = {'Col1': ['Charges', 'Realized P&L', 'Other Credit & Debit', 'Some Other Value'],
        'Col2': [100, 200, 300, 400],
        'Col3': ['True', False, 'True', 'False']}
df = pd.DataFrame(data)

# keep rows where certain charges etc are present
filtered_df = df[df.isin(["Charges", "Realized P&L", "Other Credit & Debit"]).any(1)]

I get the error:

Traceback (most recent call last):
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
TypeError: any() takes 1 positional argument but 2 were given

I tried:

filtered_df = df[df.isin(["Charges", "Realized P&L", "Other Credit & Debit"]).any()]
# removed 1 from any

but that gives:

UserWarning: Boolean Series key will be reindexed to match DataFrame index.
Traceback (most recent call last):
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/Volumes/coding/venv/lib/python3.8/site-packages/pandas/core/frame.py", line 3751, in __getitem__
    return self._getitem_bool_array(key)
  File "/Volumes/coding/venv/lib/python3.8/site-packages/pandas/core/frame.py", line 3804, in _getitem_bool_array
    key = check_bool_indexer(self.index, key)
  File "/Volumes/coding/venv/lib/python3.8/site-packages/pandas/core/indexing.py", line 2499, in check_bool_indexer
    raise IndexingError(
pandas.errors.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

Not sure what happened or why it suddenly stopped working.

>Solution :

I would argue that a more correct filtering approach is to test only the column that holds those values:

filtered_df = df[df['Col1'].isin(["Charges", "Realized P&L", "Other Credit & Debit"])]
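
As a rough illustration of why restricting the check to Col1 can be safer, here is a small sketch. The data and the `Note` column are made up for the example and are not from the question; the point is only that a whole-frame `isin` keeps a row when any column matches.

```python
import pandas as pd

# Hypothetical data: the "Note" column happens to contain the word "Charges".
df = pd.DataFrame({
    "Col1": ["Charges", "Some Other Value"],
    "Note": ["n/a", "Charges"],
})
wanted = ["Charges", "Realized P&L", "Other Credit & Debit"]

# Whole-frame check: a match in ANY column keeps the row.
whole_frame = df[df.isin(wanted).any(axis=1)]

# Single-column check: only Col1 is inspected.
col1_only = df[df["Col1"].isin(wanted)]

print(len(whole_frame))  # 2
print(len(col1_only))    # 1
```

If a match in any column really is what you want, the whole-frame version is the right one.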

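Or, to keep your original logic, specify the axis parameter explicitly. Since pandas 2.0 the arguments of DataFrame.any are keyword-only, so the positional any(1) is no longer accepted (updating openpyxl most likely pulled in a newer pandas as well):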

filtered_df = df[df.isin(["Charges", "Realized P&L", "Other Credit & Debit"]).any(axis=1)]
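
To see why dropping the argument entirely did not work, the sketch below reuses the data from the question and compares the two reductions. The variable names `mask`, `per_column` and `per_row` are only for illustration.

```python
import pandas as pd

data = {'Col1': ['Charges', 'Realized P&L', 'Other Credit & Debit', 'Some Other Value'],
        'Col2': [100, 200, 300, 400],
        'Col3': ['True', False, 'True', 'False']}
df = pd.DataFrame(data)

mask = df.isin(["Charges", "Realized P&L", "Other Credit & Debit"])

# Default axis=0 reduces down each column: one boolean per COLUMN,
# indexed by column name, so it cannot be aligned with the row index.
per_column = mask.any()
print(per_column.index.tolist())  # ['Col1', 'Col2', 'Col3']

# axis=1 reduces across each row: one boolean per ROW.
per_row = mask.any(axis=1)
print(per_row.tolist())           # [True, True, True, False]

filtered_df = df[per_row]
print(filtered_df['Col1'].tolist())
```

The Series returned by any() with no axis is indexed by column names, which is what the "Unalignable boolean Series" error is complaining about.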
