I have a large dataframe with a temperature range column that I created using pd.cut. That part works. Now I want to extract the lower bound of each range into a new column, so that I can sort the dataframe by it.
My code:
import pandas as pd
# Goal: sort the dataframe below by the 'temp_range' column
# The column should be sorted as '-60-50','-10-0','0-10','20-30'
xdf = pd.DataFrame(data={'temp_range':['-10-0','20-30','-60-50','0-10']})
xdf['Min. temp range']= xdf['temp_range'].apply(lambda x:x[:3])
xdf
Current output:
temp_range Min. temp range
0 -10-0 -10
1 20-30 20-
2 -60-50 -60
3 0-10 0-1
Expected output:
temp_range Min. temp range
0 -10-0 -10
1 20-30 20
2 -60-50 -60
3 0-10 0
Then sort the expected output by the Min. temp range column:
xdf.sort_values('Min. temp range')
temp_range Min. temp range
0 -60-50 -60
1 -10-0 -10
2 0-10 0
3 20-30 20
Solution:
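Use str.extract to capture the leading, optionally negative, integer. Note that the extracted values are strings, not numbers:
xdf['Min. temp range'] = xdf['temp_range'].str.extract(r'^(-?\d+)', expand=False)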
Output:
temp_range Min. temp range
0 -10-0 -10
1 20-30 20
2 -60-50 -60
3 0-10 0
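Since the end goal is to sort by the new column, here is a minimal sketch of one way to do that, assuming every range starts with an integer. The name sorted_df is only for illustration. The extracted values are converted to integers first, because as strings '-10' would sort before '-60'.

```python
import pandas as pd

xdf = pd.DataFrame(data={'temp_range': ['-10-0', '20-30', '-60-50', '0-10']})

# Extract the leading (optionally negative) integer and convert it to int,
# so that sorting is numeric rather than lexicographic.
xdf['Min. temp range'] = (
    xdf['temp_range'].str.extract(r'^(-?\d+)', expand=False).astype(int)
)

# sorted_df is a hypothetical name for the sorted copy
sorted_df = xdf.sort_values('Min. temp range')
print(sorted_df)
```

If some rows might not match the pattern, pd.to_numeric(..., errors='coerce') would probably be a safer conversion than astype(int).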
If you don’t need the column and just want to sort:
xdf.sort_values(by='temp_range', key=lambda s: pd.to_numeric(s.str.extract(r'^(-?\d+)', expand=False)))
Output:
temp_range
2 -60-50
0 -10-0
3 0-10
1 20-30
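As a side note, if the column still holds the intervals that pd.cut produced (that is, no custom string labels were passed), it may be sortable directly, because pd.cut returns an ordered categorical by default. A rough sketch, where the temperatures and bin edges are made up for illustration and are not taken from the question:

```python
import pandas as pd

# Hypothetical temperatures and bin edges, for illustration only
df = pd.DataFrame({'temp': [-5, 25, -55, 5]})
bins = [-60, -50, -10, 0, 10, 20, 30]

# pd.cut yields an ordered categorical of intervals such as (-10, 0]
df['temp_range'] = pd.cut(df['temp'], bins=bins)

# Ordered categoricals sort by category order, so no string parsing is needed
sorted_df = df.sort_values('temp_range')
print(sorted_df)
```

This only applies when the ranges are kept as intervals. Once they are turned into plain strings like '-10-0', the regex approach above is needed.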