Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How can I group a pandas dataframe based on a string hierarchy?

Given a dataframe like this:

Client Priority
Client 1 Low
Client 1 Medium
Client 2 Low
Client 2 High
Client 3 Low

I want to group by client and select the highest priority (Low, Medium, High) for each client. It should look something like this:

Client Priority
Client 1 Medium
Client 2 High
Client 3 Low

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

convert priority into a categorical column before grouping:

dtype = pd.CategoricalDtype(['High', 'Medium', 'Low'], ordered = True)

df['Priority'] = df.Priority.astype(dtype)

df.sort_values(['Client', 'Priority']).groupby('Client', sort = False).head(1)

     Client Priority
1  Client 1   Medium
3  Client 2     High
4  Client 3      Low

if you do not want to convert to category, you can map the positions instead:

(df.assign(temp = df.Priority.map({'Low':3,'Medium':2, 'High':1}))
   .sort_values(['Client', 'temp'])
   .drop(columns='temp')
   .groupby('Client')
   .head(1)
)
     Client Priority
1  Client 1   Medium
3  Client 2     High
4  Client 3      Low
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading