Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Sort dataframe based on minimum value of two columns

Let’s assume I have the following dataframe:

import pandas as pd
d = {'col1': [1, 2,3,4], 'col2': [4, 2, 1, 3], 'col3': [1,0,1,1], 'outcome': [1,0,1,0]}
df = pd.DataFrame(data=d)

I want this dataframe sorted by col1 and col2 on the minimum value. The order of the indexes should be 2, 0, 1, 3.

I tried this with df.sort_values(by=['col2', 'col1']), but than it takes the minimum of col1 first and then of col2. Is there anyway to order by taking the minimum of two columns?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Using numpy.lexsort:

order = np.lexsort(np.sort(df[['col1', 'col2']])[:, ::-1].T)

out = df.iloc[order]

Output:

   col1  col2  col3  outcome
2     3     1     1        1
0     1     4     1        1
1     2     2     0        0
3     4     3     1        0

Note that you can easily handle any number of columns:

df.iloc[np.lexsort(np.sort(df[['col1', 'col2', 'col3']])[:, ::-1].T)]

   col1  col2  col3  outcome
1     2     2     0        0
2     3     1     1        1
0     1     4     1        1
3     4     3     1        0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading