I have the following dataframe:
import pandas as pd
#Create DF
d = {
'Country': ['USA','USA','AUS','AUS','AUS','UK','UK'],
'poulation_k':[200,250,150,120,350,800,600,],
}
df = pd.DataFrame(data=d)
df
I would like to sort the rows by population where Country is AUS, while keeping those rows in their current position in the overall dataframe:
So my expected output will be:
I would also like to do this for the other countries, but on a manual basis, i.e. I would like the function to take the Country name as an argument. Any help would be fantastic! Thanks
>Solution :
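Use df.sort_values with df.loc. The function below sorts the population values for the given country in descending order and updates the global df in place: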
# Create a function to sort df by population
In [99]: def sort_population(country):
    ...:     # Find index of rows for the country passed
    ...:     ix = df[df.Country.eq(country)].index
    ...:     # Update the df for the above index with the new sorted population
    ...:     df.loc[ix, 'poulation_k'] = df[df.Country.eq(country)].sort_values('poulation_k', ascending=False)['poulation_k'].tolist()
    ...:
In [101]: sort_population('AUS')
In [102]: df
Out[102]:
  Country  poulation_k
0     USA          200
1     USA          250
2     AUS          350
3     AUS          150
4     AUS          120
5      UK          800
6      UK          600
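
If you would rather not modify the global df, the same idea can be sketched as a function that takes the dataframe as an argument and returns a sorted copy. This is only a sketch: the name sort_within_country and the column and ascending parameters are my own assumptions, not part of the question. It also only reorders the values of one column, which is equivalent to sorting the rows here because the only other column is Country.

```python
import pandas as pd


def sort_within_country(df, country, column="poulation_k", ascending=False):
    """Return a copy of df with `column` sorted only in the rows for `country`.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    """
    out = df.copy()
    # Boolean mask selecting the rows of the requested country
    mask = out["Country"].eq(country)
    # Assign a plain array so the sorted values are written positionally
    # instead of being realigned by index label
    out.loc[mask, column] = (
        out.loc[mask, column].sort_values(ascending=ascending).to_numpy()
    )
    return out


df = pd.DataFrame({
    "Country": ["USA", "USA", "AUS", "AUS", "AUS", "UK", "UK"],
    "poulation_k": [200, 250, 150, 120, 350, 800, 600],
})

result = sort_within_country(df, "AUS")
print(result)
```

Pass ascending=True if you want the smallest population first. If the dataframe had more columns that need to move together with the population, you would have to reassign whole rows rather than a single column.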

