Enums in a pandas DataFrame: is it not possible to group by an enum column?

I just learned about enums and thought that they would fit something I’m coding. But when I run this code, I get an error. Am I trying to do something I shouldn’t be doing or is this a bug?

When trying to groupby a column with enums, I get this error:
TypeError: '<' not supported between instances of 'CarBrand' and 'CarBrand'

The code:

import pandas as pd
from enum import Enum

class CarBrand(Enum):
    VOLVO = 'Volvo'
    BMW = 'BMW'

data = {
    'brand': [CarBrand.VOLVO, CarBrand.VOLVO, CarBrand.BMW],
    'price': [35000, 37000, 45000],
}

df = pd.DataFrame(data)
sum_per_brand = df.groupby('brand')[['price']].sum()
print(sum_per_brand)

This is the print I was expecting:
brand  price
BMW    45000
VOLVO  72000

Solution:

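DataFrame.groupby sorts the group keys by default (sort=True). Sorting compares the keys with <, and plain Enum members do not define an ordering, which is what raises the TypeError. It works if you pass sort=False, which keeps the groups in order of first appearance: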

sum_per_brand = df.groupby('brand', sort=False)[['price']].sum()
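
To see where the error comes from, here is a small check outside pandas. It assumes the same CarBrand class as in the question; the variable names `comparable` and `message` are only for illustration. Comparing two plain Enum members with < raises the same TypeError that the sorting step in groupby runs into:

```python
from enum import Enum


class CarBrand(Enum):
    VOLVO = 'Volvo'
    BMW = 'BMW'


# Plain Enum members support == and hashing, but not ordering with <.
try:
    CarBrand.VOLVO < CarBrand.BMW
    comparable = True
    message = ''
except TypeError as exc:
    comparable = False
    message = str(exc)

print(comparable)
print(message)
```

Since equality and hashing still work, grouping itself is fine; only the sort of the group keys needs an ordering.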

Alternatively, you could store the column in a data type that supports sorting, such as a pandas CategoricalDtype.
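
A minimal sketch of the categorical route, under a few assumptions of mine rather than anything stated in the answer: the enum members are mapped to their names (strings), the categories are ordered alphabetically, and `brand_dtype` is just an illustrative variable name. Once the column holds an ordered categorical of strings, the default sort in groupby has something it can order:

```python
from enum import Enum

import pandas as pd


class CarBrand(Enum):
    VOLVO = 'Volvo'
    BMW = 'BMW'


df = pd.DataFrame({
    'brand': [CarBrand.VOLVO, CarBrand.VOLVO, CarBrand.BMW],
    'price': [35000, 37000, 45000],
})

# Assumption: alphabetical order of the member names is the order we want.
names = sorted(member.name for member in CarBrand)
brand_dtype = pd.CategoricalDtype(categories=names, ordered=True)

# Replace each enum member by its name, then store it as a categorical.
df['brand'] = df['brand'].map(lambda b: b.name).astype(brand_dtype)

# observed=True keeps only the categories that actually occur in the data.
sum_per_brand = df.groupby('brand', observed=True)[['price']].sum()
print(sum_per_brand)
```

This gives BMW 45000 and VOLVO 72000, in that order, which matches the output expected in the question. The trade-off is that the column no longer holds the enum members themselves, only their names.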
