Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

replace values in a category column based on the order they appear

I have a DataFrame with 1k records similar to the below one. Column B is a group column in sorted order.

import pandas as pd
df = pd.DataFrame([['red', 0], ['green', 0], ['blue', 16],['white', 58],['yellow', 59], ['purple', 71], ['violet', 82],['grey', 82]], columns=['A','B'])
df
      A     B
0   red     0
1   green   0
2   blue    16
3   white   58
4   yellow  59
5   purple  71
6   violet  82
7   grey    82

How could I update column B to get an output like the below one (to use the column as a category later)?

output_df = pd.DataFrame([['red', 'group1'], ['green', 'group1'], ['blue', 'group2'],['white', 'group3'],['yellow', 'group4'], ['purple', 'group5'], ['violet', 'group6'],['grey', 'group6']], columns=['A','B'])
output_df


          A     B
0   red     group1
1   green   group1
2   blue    group2
3   white   group3
4   yellow  group4
5   purple  group5
6   violet  group6
7   grey    group6

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

You can use .ne(), .shift(), and .cumsum() to get the group numbers, and then use string manipulation to get the desired result:

df["B"] = "group" + df["B"].ne(df["B"].shift(1)).cumsum().astype(str)

This outputs:

        A       B
0     red  group1
1   green  group1
2    blue  group2
3   white  group3
4  yellow  group4
5  purple  group5
6  violet  group6
7    grey  group6
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading