I have a pandas DataFrame that looks like this:
col_1 col_2
6 A
2 A
5 B
3 C
5 C
3 B
6 A
6 A
2 B
2 C
5 A
5 B
I want to add a new column, col_new, that counts, for each row, how many other rows have the same values in both col_1 and col_2 (that is, excluding the row itself). The desired output would look like this:
col_1 col_2 col_new
6 A 2
2 A 0
5 B 1
3 C 0
5 C 0
3 B 0
6 A 2
6 A 2
2 B 0
2 C 0
5 A 0
5 B 1
Here is what I tried, but I am not sure it is the right approach:
df['col_new'] = df.groupby(['col_1', 'col_2']).count()
But then I got the error: TypeError: incompatible index of inserted column with frame index
Thanks in advance.
>Solution :
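groupby(...).count() returns one row per group, indexed by the group keys, so it cannot be aligned with the original frame's index. Use transform('count') instead, which returns one value per original row, and subtract 1 to exclude the row itself: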
df['col_new'] = df.groupby(['col_1', 'col_2'])['col_2'].transform('count').sub(1)
Output:
col_1 col_2 col_new
0 6 A 2
1 2 A 0
2 5 B 1
3 3 C 0
4 5 C 0
5 3 B 0
6 6 A 2
7 6 A 2
8 2 B 0
9 2 C 0
10 5 A 0
11 5 B 1
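
For reference, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same transform. The variable name group_sizes is only illustrative and is not part of the original answer; it is there to show why the plain aggregation does not fit into the frame (one entry per unique pair, not one per row).

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "col_1": [6, 2, 5, 3, 5, 3, 6, 6, 2, 2, 5, 5],
    "col_2": ["A", "A", "B", "C", "C", "B", "A", "A", "B", "C", "A", "B"],
})

# Plain aggregation: one entry per unique (col_1, col_2) pair,
# indexed by the group keys rather than by the original row index
group_sizes = df.groupby(["col_1", "col_2"]).size()

# transform broadcasts the per-group count back onto every row,
# so the result shares df's index; subtract 1 to exclude the row itself
df["col_new"] = df.groupby(["col_1", "col_2"])["col_2"].transform("count").sub(1)

print(df)
```

Note that 'count' skips missing values in the counted column. If col_2 could contain NaN, transform('size') may be the safer choice, though rows with NaN in a grouping key are dropped by groupby by default either way.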