Home groupby on pandas dataframe created with pandas.read_csv

Questions

groupby on pandas dataframe created with pandas.read_csv

January 31, 2024

How do you create a dataframe from a csv file and be able to execute the same groupby command that simply works when creating the dataframe by hand?

Step-by-step i tried to do:

I create a pandas dataframe with multiindexed columns:

df = pd.DataFrame({
    ('A', 'x'): [1, 2, 3, 4],
    ('A', 'y'): [10, 15, 20, 25],
    ('B', 'x'): [7, 8, 9, 10],
    ('B', 'y'): [1, 1, 2, 2],
})

I then add a new column with some "metadata" that has only one column identifier:

df["C"] = ["F1", "F1", "F2", "F2"]

Results in:

   A       B      C
   x   y   x  y    
0  1  10   7  1  F1
1  2  15   8  1  F1
2  3  20   9  2  F2
3  4  25  10  2  F2

And i group my data on this column "C" – this works:

grouped = df.groupby("C").sum()

    A       B   
    x   y   x  y
C               
F1  3  25  15  2
F2  7  45  19  4

now i store the dataframe as csv file and read it again:

df.to_csv("test.csv")
df2 = pd.read_csv("test.csv", header=[0, 1], index_col=0)

   A       B                     C
   x   y   x  y Unnamed: 5_level_1
0  1  10   7  1                 F1
1  2  15   8  1                 F1
2  3  20   9  2                 F2
3  4  25  10  2                 F2

executing the same groupby command on the read-in dataframe results in an error:

df2.groupby("C").sum()

  File "xxx\py311\Lib\site-packages\pandas\core\groupby\grouper.py", line 980, in get_grouper
    raise ValueError(f"Grouper for '{name}' not 1-dimensional")
ValueError: Grouper for 'C' not 1-dimensional

I just cant seem to work out what i need to do to be able to groupby the column after reading in the data from csv?

>Solution :

Pandas ignores the trailing empty string levels^* during the groupby.

grouped.index
# Index(['F1', 'F2'], dtype='object', name='C') <- not a tuple

You can rename your "Unnamed" columns to empty string when importing the data to get the previous behavior:

df2 = (pd.read_csv("test.csv", header=[0, 1], index_col=0)
         .rename(columns=lambda x: '' if x.startswith('Unnamed:') else x)
       )

df2.groupby('C').sum()

Output:

    A       B   
    x   y   x  y
C               
F1  3  25  15  2
F2  7  45  19  4

^*_{This means that a ('C', '', 'X', '') name could be matched by ('C', '', 'X'), but not by 'C'.}

group-by

byMR

Published January 31, 2024

Add a comment

custom widget not refreshing on setState in flutter

byMR

January 31, 2024

Questions

Using python to generate json unexpectedly wrapping strings into lists

byMR

January 31, 2024

Questions

My .properties file isn't being picked up and my maven profile properties aren't injected

byMR

January 31, 2024

Questions

How to match fixed length string with quantifiers

byMR

January 31, 2024

Questions

NSViewRepresentable.updateNSView is not called unless the modified binding variable gets used inside

byMR

January 31, 2024

Questions

If/else condition is not working with Python

byMR

January 31, 2024

groupby on pandas dataframe created with pandas.read_csv

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

custom widget not refreshing on setState in flutter

Using python to generate json unexpectedly wrapping strings into lists

My .properties file isn't being picked up and my maven profile properties aren't injected

How to match fixed length string with quantifiers

NSViewRepresentable.updateNSView is not called unless the modified binding variable gets used inside

If/else condition is not working with Python

Keep Up to Date with the Most Important News

groupby on pandas dataframe created with pandas.read_csv

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

custom widget not refreshing on setState in flutter

Using python to generate json unexpectedly wrapping strings into lists

My .properties file isn't being picked up and my maven profile properties aren't injected

How to match fixed length string with quantifiers

NSViewRepresentable.updateNSView is not called unless the modified binding variable gets used inside

If/else condition is not working with Python

Discover more from Dev solutions