I have a dataframe as below:
data = {'Test': [' ', ' ', 'K', ],
'Name': ['A', 'B', 'B', 'B'],
'value': ['D1', 'A1', ' ', 'C1'],
'time': [227, 227, 227, 230]}
df = pd.DataFrame(data)
Test Name value time
0 A D1 227
1 B A1 227
2 K B 227
3 B C1 230
And I want to make the df to sort as:
Test Name value time
0 A D1 227
1 K B 227
2 B A1 227
3 B C1 230
I’ve tried using sort_values, but still can’t figure it out. Or should I add more condition for the empty field ‘ ‘(or NA) in sort?
>Solution :
Better not use empty strings/spaces to denote empty cells. Use NAs/NaNs that are directly handled by sort_values:
out = (df.replace(' ', pd.NA)
.sort_values(by=['Name', 'time', 'Test'])
)
Which is equivalents to:
out = (df.replace(' ', pd.NA)
.sort_values(by=['Name', 'time', 'Test'], na_position='last')
)
Output:
Test Name value time
0 <NA> A D1 227
2 K B <NA> 227
1 <NA> B A1 227
3 <NA> B C1 230