Why is grouping on columns (axis=1) ineffective or at least faulty?

August 1, 2023

My input is a pandas DataFrame:

   item foo_x foo_y bar_x bar_y
0     1     A     B     C     D
1     2     D     E     F     G
2     3     H     I     J     K
3     4     L     M     N     O

import pandas as pd

df = pd.DataFrame({'item': [1, 2, 3, 4],
 'foo_x': ['A', 'D', 'H', 'L'],
 'foo_y': ['B', 'E', 'I', 'M'],
 'bar_x': ['C', 'F', 'J', 'N'],
 'bar_y': ['D', 'G', 'K', 'O']})

I’m not asking much of the groupby method; I only expect this standard aggregation:

   item       x       y
0     1  [A, C]  [B, D]
1     2  [D, F]  [E, G]
2     3  [H, J]  [I, K]
3     4  [L, N]  [M, O]

But my code below raises an error that makes no sense to me:

df_output = (
    df.rename(lambda x: x.split("_")[-1], axis=1)
        .groupby(level=0, axis=1).agg(list)
)

ValueError: Length of values (2) does not match length of index (4)

To be honest, this is absolutely counterintuitive given how we’re used to applying groupby(..., axis=0).

Can you please explain the logic behind this?

>Solution:

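The issue is that iterating over a DataFrame yields its column names, not its values. With axis=1, each group passed to agg is a DataFrame holding every column that shares a label, so calling list on it returns column labels: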

list(pd.DataFrame({'A': [1, 2], 'B': [3, 4]}))
# ['A', 'B']

Using a small print hack to see what’s going on in our groupby:

(df.rename(lambda x: x.split("_")[-1], axis=1)
   .groupby(level=0, axis=1).agg(lambda x: print(list(x)))
)

Printed output:

['item']
['x', 'x']
['y', 'y']
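
This also explains the error message: for the "x" group, list returns 2 labels, while the result is expected to have 4 entries, one per row. The sketch below reproduces this outside of groupby. The `group` frame is a hand-built stand-in for what agg receives, an assumption made for illustration rather than something captured from pandas internals.

```python
import pandas as pd

# Hypothetical stand-in for the "x" group handed to agg when grouping on
# columns: a 4-row DataFrame whose two columns are both labelled "x".
group = pd.DataFrame(
    [["A", "C"], ["D", "F"], ["H", "J"], ["L", "N"]],
    columns=["x", "x"],
)

labels = list(group)              # iterating a DataFrame yields column labels
rows = group.to_numpy().tolist()  # one inner list of values per row

# For comparison: a Series (what agg receives per column when grouping on
# rows) iterates over its values.
column = pd.Series(["A", "D", "H", "L"])
values = list(column)

print(len(labels), labels)  # 2 entries: does not match the 4-row index
print(len(rows), rows)      # 4 entries: matches the 4-row index
print(values)
```

The length of `labels` is the number of columns in the group, and the length of `rows` is the number of rows, which is why only the second form can be assigned back as a column.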

To avoid that, convert each group to a NumPy array first, so that tolist() returns one list of values per row:

df_output = (
    df.rename(lambda x: x.split("_")[-1], axis=1)
      .groupby(level=0, axis=1).agg(lambda x: x.to_numpy().tolist())
)

Output:

  item       x       y
0  [1]  [A, C]  [B, D]
1  [2]  [D, F]  [E, G]
2  [3]  [H, J]  [I, K]
3  [4]  [L, N]  [M, O]
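
Note that item ends up wrapped in a list as well, because it forms a one-column group of its own. If the goal is the exact output from the question, one possible sketch is to move item into the index and group the column labels by hand. This sidesteps groupby(..., axis=1) entirely, which is deprecated as of pandas 2.1. The names `wide`, `groups` and `out` are illustrative, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'item': [1, 2, 3, 4],
                   'foo_x': ['A', 'D', 'H', 'L'],
                   'foo_y': ['B', 'E', 'I', 'M'],
                   'bar_x': ['C', 'F', 'J', 'N'],
                   'bar_y': ['D', 'G', 'K', 'O']})

# Keep "item" out of the aggregation by moving it into the index.
wide = df.set_index('item')

# Map each suffix to the columns that carry it, preserving column order:
# {'x': ['foo_x', 'bar_x'], 'y': ['foo_y', 'bar_y']}
groups = {}
for name in wide.columns:
    groups.setdefault(name.split('_')[-1], []).append(name)

# For each suffix, collect the row-wise values of its columns as lists.
out = pd.DataFrame(
    {suffix: wide[cols].to_numpy().tolist() for suffix, cols in groups.items()},
    index=wide.index,
).reset_index()

print(out)
```

The row-wise conversion is the same to_numpy().tolist() trick as above; only the grouping step is done in plain Python.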

pandas

byMR
