Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Pandas dataframe: change duplicated rows so the first duplicates are in consecutive order

I have a dataframe with a ton of rows and columns. The following dataframe is just a simplified version of it.

dct = {'day': ['Mon', 'Tue', 'Wed', 'Wed', 'Thur', 'Fri', 'Fri', 'Sat', 'Sun'],
       'id': ['1', '2', '3', '4', '5', 
              '6', '7', '8', '9']}

df = pd.DataFrame(dct)
    day id
0   Mon  1
1   Tue  2
2   Wed  3
3   Wed  4
4  Thur  5
5   Fri  6
6   Fri  7
7   Sat  8
8   Sun  9

I want to change the ids to match the duplicated days (in this scenario) but still be consecutive.

    day id
0   Mon  1
1   Tue  2
2   Wed  3
3   Wed  3
4  Thur  4
5   Fri  5
6   Fri  5
7   Sat  6
8   Sun  7

So far i managed to change the ids but they are not consecutive and i’m kinda stuck right now.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

m = df['day'].duplicated()
df['id'] = df['id'].mask(m).ffill()
    day id
0   Mon  1
1   Tue  2
2   Wed  3
3   Wed  3
4  Thur  5
5   Fri  6
6   Fri  6
7   Sat  8
8   Sun  9

>Solution :

You can subtract a cumsum of the duplicated values:

# ensure using numeric dtype
df['id'] = df['id'].astype(int)

# subtract duplicated cumsum
df['id'] -= df['day'].duplicated().cumsum()

Output:

    day  id
0   Mon   1
1   Tue   2
2   Wed   3
3   Wed   3
4  Thur   4
5   Fri   5
6   Fri   5
7   Sat   6
8   Sun   7

Intermediates:

    day  id  duplicated  cumsum  id-cumsum
0   Mon   1       False       0          1
1   Tue   2       False       0          2
2   Wed   3       False       0          3
3   Wed   4        True       1          3
4  Thur   5       False       1          4
5   Fri   6       False       1          5
6   Fri   7        True       2          5
7   Sat   8       False       2          6
8   Sun   9       False       2          7
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading