Unable to get the count of the data based on other columns in pandas

Below is the dataframe

AGMTNO  Employee_id POS Cust_Pincode    Pincode VaR RESOLVED_COUNT  P_Dealer_ArielDistance
0   MP3012TW0087178 5035496 606.0   487551  487551  29.070673   1   0.000000
1   MP3012TW0087282 5035496 525.0   470661  470661  30.075669   0   23.100406
2   TN3005TW0123515 5035496 874.0   639207  624620  34.000748   1   21.645896

I’m trying to write a function or code where it’ll groupby on Employee_id, get the count of AGMTNO and count of AGMTNO where Cust_Pincode and Pincode is not matching

The expected output is below

Below is the exptected output

    FCE_Employee_id No_Of_Customers No_Of_Customers_not_matching
0   5035496         3                1

The code which i’ve tried is below

df.groupby('FCE_Employee_id').agg({'AGMTNO': 'count',
                                   'Cust_Pincode': lambda x: (x != x.shift()).sum()})

But the above code is not working.

>Solution :

Simply assign first a column to check for matches, then groupby.agg:

(df
 .assign(match=lambda d: d['Cust_Pincode'].ne(d['Pincode']))
 .groupby('Employee_id', as_index=False)
 .agg(**{'No_Of_Customers': ('AGMTNO', 'count'),
         'No_Of_Customers_not_matching': ('match', 'sum')
        })
)

output:

   Employee_id  No_Of_Customers  No_Of_Customers_not_matching
0      5035496                3                             1

Leave a Reply