Pandas – groupby ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead

May 2, 2023

I updated Pandas from (I think) 1.5.1 to 2.0.1. Since then I have been getting an error on some code that worked just fine before.

df = df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()

Traceback (most recent call last):
  File "f:…\My_python_file.py", line 37, in <module>
    df = df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()
  File "C:\Users…\Local\Programs\Python\Python310\lib\site-packages\pandas\core\groupby\generic.py", line 1767, in __getitem__
    raise ValueError(
ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead.

Solution:

Pandas versions before 2.0.0 raise a FutureWarning if you select multiple columns from a groupby without double brackets:

FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead

From Pandas 2.0.0 onward, the same code raises a ValueError:

ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead.

For example:

# Pandas < 2.0.0
#             Missing [[ ... ]] --v              --v
>>> df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()
...
FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.
  df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()

# Pandas >= 2.0.0
>>> df.groupby(df['date'].dt.date)['Lake', 'Canyon'].mean().reset_index()
...
ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead.
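
The error message mentions a tuple because Python itself packs comma-separated subscripts into a single tuple before the object ever sees them. The sketch below illustrates this with a hypothetical helper class, ShowKey (not part of Pandas), whose __getitem__ simply returns the key it receives:

```python
class ShowKey:
    """Hypothetical helper (not a Pandas class): echoes the subscript key."""

    def __getitem__(self, key):
        return key


probe = ShowKey()

# Single brackets with a comma: Python passes ONE argument, a tuple.
tuple_key = probe['Lake', 'Canyon']

# Double brackets: the inner brackets build a list, which is passed as the key.
list_key = probe[['Lake', 'Canyon']]

print(type(tuple_key).__name__, tuple_key)  # tuple ('Lake', 'Canyon')
print(type(list_key).__name__, list_key)    # list ['Lake', 'Canyon']
```

So ['Lake', 'Canyon'] after the groupby is received as the tuple ('Lake', 'Canyon'), which is exactly what the ValueError complains about.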

Fix this by selecting the columns with a list, i.e. double brackets [[col1, col2, ...]]:

>>> df.groupby(df['date'].dt.date)[['Lake', 'Canyon']].mean().reset_index()
         date  Lake  Canyon
0  2023-05-02   1.5     3.5

Minimal Reproducible Example:

import pandas as pd

df = pd.DataFrame({'date': ['2023-05-02 12:34:56', '2023-05-02 12:32:12'], 
                   'Lake': [1, 2], 'Canyon': [3, 4]})
df['date'] = pd.to_datetime(df['date'])
print(df)

# Output
                 date  Lake  Canyon
0 2023-05-02 12:34:56     1       3
1 2023-05-02 12:32:12     2       4
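
Putting the example data and the fix together, a self-contained sketch might look like the following. The variable names cols and daily are my own choices for illustration; the original code assigns the result back to df instead.

```python
import pandas as pd

# Same sample data as the reproducible example above.
df = pd.DataFrame({'date': ['2023-05-02 12:34:56', '2023-05-02 12:32:12'],
                   'Lake': [1, 2], 'Canyon': [3, 4]})
df['date'] = pd.to_datetime(df['date'])

# Keep the column names in a list and pass that list as the key.
# This form is accepted both before and after Pandas 2.0.0.
cols = ['Lake', 'Canyon']
daily = df.groupby(df['date'].dt.date)[cols].mean().reset_index()

print(daily)
```

Holding the column names in a list variable makes it harder to fall back into the tuple form by accident, and the same list can be reused for other selections.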