How to sort the columns of a pandas DataFrame using the key argument

Assume a Pandas data frame (for the sake of simplicity, let’s say with three columns). The columns are titled A, B and d.

$ import pandas as pd
$ df = pd.DataFrame([[1, 2, "a"], [1, "b", 3], ["c", 4, 6]], columns=['A', 'B', 'd'])
$ df
   A  B  d
0  1  2  a
1  1  b  3
2  c  4  6

Further assume that I wish to sort the data frame so that the columns have exactly the following order: d, A, B. The rows of the data frame shall not be rearranged in any way. The desired output is:

$ col_target_order = ['d', 'A', 'B']
$ df_desired
   d  A  B
0  a  1  2
1  3  1  b
2  6  c  4

I know that this can be done via the sort_index function of pandas. However, the following won’t work, as the input list (col_target_order) is not callable:

$ df.sort_index(axis=1, key=col_target_order)

What key specification do I have to use?

Solution:

Don’t sort, just index:

out = df[col_target_order]
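
As a self-contained sketch of the indexing approach, using the frame from the question. The `reindex` call is an addition for comparison, not part of the original answer; it should give the same result as long as every label in the list exists in the frame:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, "a"], [1, "b", 3], ["c", 4, 6]], columns=["A", "B", "d"])
col_target_order = ["d", "A", "B"]

# Selecting with a list of labels returns the columns in the listed order;
# the rows are left exactly as they were.
out = df[col_target_order]

# Assumed-equivalent alternative: reindex along the column axis.
out_reindexed = df.reindex(columns=col_target_order)
```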

For the sake of argument, you could use sort_index with a crafted Series as the key. The key must be a callable that maps each column label to its target position:

df.sort_index(axis=1, key=pd.Series(range(len(col_target_order)), index=col_target_order).get)

Or an Index indexer:

df.sort_index(axis=1, key=pd.Index(col_target_order).get_indexer)

Output:

   d  A  B
0  a  1  2
1  3  1  b
2  6  c  4
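
To make the role of key more explicit, here is a minimal sketch with a hand-written key function. The name `order_key` is hypothetical and only for illustration. The point is that sort_index calls the key once with the whole Index of labels and expects an array-like of the same length back, which is why passing the plain list fails:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, "a"], [1, "b", 3], ["c", 4, 6]], columns=["A", "B", "d"])
col_target_order = ["d", "A", "B"]

def order_key(labels):
    # `labels` is the pd.Index of column labels, passed in a single call.
    # Map each label to its position in the target order.
    position = {label: i for i, label in enumerate(col_target_order)}
    return pd.Index([position[label] for label in labels])

# Columns are sorted by the positions returned from the key, not by label.
out = df.sort_index(axis=1, key=order_key)
```

This sketch assumes every column label appears in col_target_order; a missing label would raise a KeyError inside the key function.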