Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to alphabetically sort values in a pandas column that is a list

I had a question about sorting values in a list in a dataframe. I have sales order data with about 3000 rows. A sales order can contain one or multiple inventory items. I reshaped my data and was able to create a dataset like this:

Current result:

sales_order bundle
so-1 apple, pear, tomato
so-2 bread, fish
so-3 fish, bread
so-4 pear, tomato, apple
so-5 tomato

I would like to sort each ‘bundle’ value alphabetically by row, so I can group the resulting string, quantify, and visualize in tableau against other data in my dataframe, but am having trouble doing so.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

I tried various things like a lambda function and ‘sort’, but that ruined the data. I also tried using sort values on the bundle column, and this helped in certain ways, but did not sort the items in the list alphabetically.

sales_order bundle
so-1 apple, pear, tomato
so-2 bread, fish
so-3 bread, fish
so-4 apple, pear, tomato
so-5 tomato

is there any way to parse through each ‘bundle’ column value and sort the value alphabetically?

I did try
z = df.copy()
z = z[‘bundle’].sort_values()

this works for my sample data I provided in this question, but only because the data is oversimplified. It doesnt work on my actual data.

here is an example where the sort_values is not sorting my data:

    testdf = pd.DataFrame({'sales_order': ['so-1','so-2'],\
    'Inv_item': ['Revolve 3 Clean 110cm,Resolution 3 Glare - 120bp- Normal viscosity',\
    'Resolution 3 Glare - 120bp - Normal viscosity,Revolve 3 Clean 110cm']})

    testdf['Inv_item'].sort_values()

    1    Resolution 3 Glare - 120bp - Normal viscosity,...
    0    Revolve 3 Clean 110cm,Resolution 3 Glare - 120...
    Name: Inv_item, dtype: object

Thanks for reading/helping!

>Solution :

You could use the below to sort each value:

df['bundle'].str.split(', ').map(sorted).str.join(', ')

Another option is to use frozenset which can be used in groupby() It wont sort your values, but could be useful.

df['bundle'].str.split(', ').map(frozenset)

Output:

0    apple, pear, tomato
1            bread, fish
2            bread, fish
3    apple, pear, tomato
4                 tomato
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading