Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Sort dataframe include empty field

I have a dataframe as below:

data = {'Test': [' ', ' ', 'K', ],
        'Name': ['A', 'B', 'B', 'B'],
        'value': ['D1', 'A1', ' ', 'C1'],
        'time': [227, 227, 227, 230]}
df = pd.DataFrame(data)
  Test  Name value  time
0        A    D1    227
1        B    A1    227
2   K    B          227
3        B    C1    230

And I want to make the df to sort as:

  Test  Name value  time
0        A    D1    227
1   K    B          227
2        B    A1    227
3        B    C1    230

I’ve tried using sort_values, but still can’t figure it out. Or should I add more condition for the empty field ‘ ‘(or NA) in sort?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Better not use empty strings/spaces to denote empty cells. Use NAs/NaNs that are directly handled by sort_values:

out = (df.replace(' ', pd.NA)
         .sort_values(by=['Name', 'time', 'Test'])
       )

Which is equivalents to:

out = (df.replace(' ', pd.NA)
         .sort_values(by=['Name', 'time', 'Test'], na_position='last')
       )

Output:

   Test Name value  time
0  <NA>    A    D1   227
2     K    B  <NA>   227
1  <NA>    B    A1   227
3  <NA>    B    C1   230
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading