Add incremental counter for repeating feature subsets in pandas

May 17, 2022

Suppose I have the following table:

id	name	mail	date
1	Sta	sta@example.com	11.11.22
2	Danny	dany@example.com	11.11.22
3	Elle	elle@example.com	11.11.22
4	Elle	falsemail@example.com	11.11.22
5	Elle	elle@example.com	12.11.22

What is the best way to create an incremental counter for repeating observations for the feature subset [name, date]?

Desired output:

id	name	mail	date	counter
1	Sta	sta@example.com	11.11.22	1
2	Danny	dany@example.com	11.11.22	1
3	Elle	elle@example.com	11.11.22	1
4	Elle	falsemail@example.com	11.11.22	2
5	Elle	elle@example.com	12.11.22	1

Edit: The table itself is sorted correctly and the duplicates appear after each other.

>Solution :

df['counter'] = df.groupby(['name', 'date']).cumcount() + 1 
df
   id   name                   mail        date  counter
0  1    Sta         sta@example.com   11.11.22         1
1  2   Danny       dany@example.com   11.11.22         1
2  3   Elle        elle@example.com   11.11.22         1
3  4   Elle   falsemail@example.com   11.11.22         2
4  5   Elle        elle@example.com   12.11.22         1