Create a new column based on count of other columns

February 13, 2023

I have a pandas DataFrame that looks like this:

col_1   col_2
6       A       
2       A       
5       B       
3       C       
5       C       
3       B       
6       A       
6       A       
2       B       
2       C       
5       A       
5       B

I want to add a new column, col_new, that counts, for each row, how many other rows have the same values in both col_1 and col_2 (that is, excluding the row itself). The desired output would look like this:

col_1   col_2   col_new
6       A       2
2       A       0
5       B       1
3       C       0  
5       C       0
3       B       0
6       A       2
6       A       2
2       B       0
2       C       0
5       A       0
5       B       1

Here is what I tried, but I am not sure if it's the right approach:

df['col_new'] = df.groupby(['col_1', 'col_2']).count()

But then I got the error: TypeError: incompatible index of inserted column with frame index

Thanks in advance.

>Solution :

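groupby(...).count() returns one row per group, indexed by the group keys, so the result cannot be aligned with the index of the original frame. Use transform instead, which returns one value per original row, and subtract 1 to exclude the row itself: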

df['col_new'] = df.groupby(['col_1', 'col_2'])['col_2'].transform('count').sub(1)

Output:

    col_1 col_2  col_new
0       6     A        2
1       2     A        0
2       5     B        1
3       3     C        0
4       5     C        0
5       3     B        0
6       6     A        2
7       6     A        2
8       2     B        0
9       2     C        0
10      5     A        0
11      5     B        1
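
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the frame from the question by hand (the variable names `grouped` and `per_group` are just illustrative) and contrasts the aggregated result, which has one entry per group, with the transformed result, which has one entry per original row.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "col_1": [6, 2, 5, 3, 5, 3, 6, 6, 2, 2, 5, 5],
    "col_2": ["A", "A", "B", "C", "C", "B", "A", "A", "B", "C", "A", "B"],
})

grouped = df.groupby(["col_1", "col_2"])

# Aggregation: one value per (col_1, col_2) group, indexed by the group keys.
# Its index does not match df.index, so it cannot be assigned as a column directly.
per_group = grouped["col_2"].count()

# transform: broadcasts each group's count back to every row of that group,
# so the result shares df.index. Subtracting 1 excludes the row itself.
df["col_new"] = grouped["col_2"].transform("count").sub(1)

print(len(per_group), len(df))
print(df)
```

The aggregated series has fewer entries than the frame has rows, which is why the direct assignment in the question fails, while the transformed series lines up with the original index.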