Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to create a new column that gets count by groupby in pandas

I have a dataset with Employee history. Including Name, ID number, Job, Manager, Manager ID, and Termed(1 = Termed or 0 = Active). Here is a sample.

df =

Name        Emp_ID      Job        Manager      Manager_ID      Termed 
Adam         100       Sales       Steve           103            0
Beth         101       Sales       Steve           103            0 
Rick         102       Tech        John            106            0 
Steve        103       Sales Mgr.  Lisa            110            0
Drake        104       Tech        John            106            1
Sarah        105       Sales       Steve           103            1
John         106       Tech Mgr.   Rodger          122            0
Mike         107       Sales       Steve           103            1 

I would like to findout by manager how many 1 or Terminated employees each manager has.
The code I tried was df['term_count'] = df.groupby('Manager_ID')['Termed'].sum()
When I try this code I get a NaN value when I’m not sure why. This is the desired output.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df =

Name     Emp_ID     Job       Manager     Manager_ID       Termed      term_count
Adam      100      Sales       Steve        103               0             0
Beth      101      Sales       Steve        103               0             0
Rick      102      Tech        John         106               0             0 
Steve     103      Sales Mgr.  Lisa         110               0             2
Drake     104      Tech        John         106               1             0
Sarah     105      Sales       Steve        103               1             0 
John      106      Tech Mgr.   Rodger       122               0             1
Mike      107      Sales       Steve        103               1             0

Any suggestions would be great thank you!!

>Solution :

Use additional pd.Series.map on aggregated sums:

df['term_count'] = df['Emp_ID'].map(df.groupby('Manager_ID')['Termed'].sum()).fillna(0)

    Name  Emp_ID         Job Manager  Manager_ID  Termed  term_count
0   Adam     100       Sales   Steve         103       0         0.0
1   Beth     101       Sales   Steve         103       0         0.0
2   Rick     102        Tech    John         106       0         0.0
3  Steve     103  Sales Mgr.    Lisa         110       0         2.0
4  Drake     104        Tech    John         106       1         0.0
5  Sarah     105       Sales   Steve         103       1         0.0
6   John     106   Tech Mgr.  Rodger         122       0         1.0
7   Mike     107       Sales   Steve         103       1         0.0
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading