Home pandas dataframe custom sort function

Questions

pandas dataframe custom sort function

March 5, 2024

I am sorry if it has been explained before, however, I still can’t find a proper solution.
I have this short code with pandas.DataFrame:

import pandas as pd


table = {
    "key4": ["key3", "command4"],
    "key2": ["key1", "command2"],
    "key3": ["cron3", "command3"],
    "key5": ["cron5", "command5"],
    "key1": ["cron1", "command1"]   
}

columns = ["trigger", "command"]
df = pd.DataFrame.from_dict(table, orient='index', columns=columns)

I would like to order each row under its parent, in other words if key1 is found in the trigger column it should be ordered right after the row with the name key1. (I only expect a single occurrance of key1 both as a name and as a value for a trigger).
Or is it an overkill and should I try using a different format?
So the output of print(df) should look like this:

     trigger    command

key3   cron3    command3
key4    key3    command4
key5   cron5    command5
key1   cron1    command1
key2    key1    command2

Can I somehow import a function into df.sort_values() somehow so that this sorting were more customized? Thank you in advance.

>Solution :

You can use masks and numpy.lexsort:

import numpy as np

m = df['trigger'].isin(df.index)
key = df['trigger'].where(m, df.index)

out = df.iloc[np.lexsort([m, key])]

Or in pure pandas (less elegant IMO, but more flexible if you need a custom order; see below):

out = (df
   .assign(m=df['trigger'].isin(df.index),
           key=lambda d: d['trigger'].where(m, d.index))
   .sort_values(by=['key', 'm'])
   .drop(columns=['m', 'key'])
)

Output:

     trigger   command
key1   cron1  command1
key2    key1  command2
key3   cron3  command3
key4    key3  command4
key5   cron5  command5

Intermediates:

# before sorting
     trigger   command      m   key
key4    key3  command4   True  key3
key2    key1  command2   True  key1
key3   cron3  command3  False  key3
key5   cron5  command5  False  key5
key1   cron1  command1  False  key1

# after sorting
     trigger   command      m   key
key1   cron1  command1  False  key1
key2    key1  command2   True  key1
key3   cron3  command3  False  key3
key4    key3  command4   True  key3
key5   cron5  command5  False  key5

If you want to keep the original order of the groups (key3 -> key5 -> key1):

out = (df
   .assign(m=df['trigger'].isin(df.index),
           key=lambda d: pd.Categorical(d['trigger'].where(m, d.index),
                                       categories=d.index[~d['m']].unique(),
                                        ordered=True)
          )
   .sort_values(by=['key', 'm'])
   .drop(columns=['m', 'key'])
)

Variant:

m = df['trigger'].isin(df.index)
key = df['trigger'].where(m, df.index)
order = df.index[~m].unique()

tmp = df.assign(key=pd.Categorical(key, categories=order, ordered=True))

out = (pd.concat([tmp[~m], tmp[m]]).sort_values(by='key', kind='stable')
         .drop(columns='key')
      )

Output:

     trigger   command
key3   cron3  command3
key4    key3  command4
key5   cron5  command5
key1   cron1  command1
key2    key1  command2

dataframe

byMR

Published March 05, 2024

Add a comment

bash: while read -r using <<< not working on macos

byMR

March 5, 2024

Questions

Gsub with multiple special characters

byMR

March 5, 2024

Questions

My cookie is not displayed in my response React Axios

byMR

March 5, 2024

Questions

Regex that finds both integers, decimals, and thousands separators

byMR

March 5, 2024

Questions

What is C++ doing when I use the comma operator in the initializer expressions?

byMR

March 5, 2024

Questions

How can I parse a string and create a dynamic bag out of it?

byMR

March 5, 2024

pandas dataframe custom sort function

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Like this:

Leave a ReplyCancel reply

Read more

bash: while read -r using <<< not working on macos

Gsub with multiple special characters

My cookie is not displayed in my response React Axios

Regex that finds both integers, decimals, and thousands separators

What is C++ doing when I use the comma operator in the initializer expressions?

How can I parse a string and create a dynamic bag out of it?

Keep Up to Date with the Most Important News

pandas dataframe custom sort function

MEDevel.com: Open-source for Healthcare and Education

>Solution :

Share this:

Like this:

Leave a ReplyCancel reply

Keep Up to Date with the Most Important News

Read more

bash: while read -r using <<< not working on macos

Gsub with multiple special characters

My cookie is not displayed in my response React Axios

Regex that finds both integers, decimals, and thousands separators

What is C++ doing when I use the comma operator in the initializer expressions?

How can I parse a string and create a dynamic bag out of it?

Discover more from Dev solutions