How to aggregate 3 columns in a DataFrame to get counts and the distribution of values in separate columns in Python Pandas?

I have a Pandas DataFrame like the one below:

data types:

  • ID – int

  • TIME – int

  • TG – int

    ID TIME TG
    111 20210101 0
    111 20210201 0
    111 20210301 1
    222 20210101 0
    222 20210201 1
    333 20210201 1
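
For reproducibility, the sample above can be built as follows. This is a minimal sketch; the variable name df is an assumption, chosen to match the name used in the answer's code.

```python
import pandas as pd

# Sample data from the question; all three columns are plain integers.
df = pd.DataFrame({
    "ID":   [111, 111, 111, 222, 222, 333],
    "TIME": [20210101, 20210201, 20210301, 20210101, 20210201, 20210201],
    "TG":   [0, 0, 1, 0, 1, 1],
})
print(df.dtypes)
```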

I need to aggregate the above DataFrame to find out:

  1. how many IDs there are for each value in TIME
  2. how many "1" values in TG there are for each value in TIME
  3. how many "0" values in TG there are for each value in TIME

So I need something like the table below:

TIME     | num_ID | num_1 | num_0
---------|--------|-------|--------
20210101 | 2      | 0     | 2
20210201 | 3      | 2     | 1
20210301 | 1      | 1     | 0

How can I do that in Python Pandas?

>Solution :

Use GroupBy.size to count the rows for each TIME value, combined with crosstab to count the number of 0 and 1 values:

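import pandas as pd

# Rows per TIME value, joined with a TIME x TG cross-tabulation of the 0/1 counts
df1 = (df.groupby('TIME').size().to_frame('num_ID')
         .join(pd.crosstab(df['TIME'], df['TG']).add_prefix('num_'))
         .reset_index())
print(df1)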
       TIME  num_ID  num_0  num_1
0  20210101       2      2      0
1  20210201       3      1      2
2  20210301       1      0      1
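
Note that crosstab orders the columns as num_0, num_1, while the question asks for num_1 before num_0. The sketch below is one possible way to get the requested order, and it also keeps both count columns even if one TG value never occurs. The sample DataFrame is rebuilt from the question so the snippet runs on its own; the names counts and result are only illustrative.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID":   [111, 111, 111, 222, 222, 333],
    "TIME": [20210101, 20210201, 20210301, 20210101, 20210201, 20210201],
    "TG":   [0, 0, 1, 0, 1, 1],
})

# Count 0/1 per TIME; reindex guarantees both columns exist even if one
# value is absent from the data (a missing column is filled with 0).
counts = (pd.crosstab(df["TIME"], df["TG"])
            .reindex(columns=[0, 1], fill_value=0)
            .add_prefix("num_"))

# Join with the row count per TIME and pick the column order from the question.
result = (df.groupby("TIME").size().to_frame("num_ID")
            .join(counts)
            .reset_index()[["TIME", "num_ID", "num_1", "num_0"]])
print(result)
```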

Another option, if you only need to count the 0 and 1 values, is named aggregation in GroupBy.agg:

# Flag the rows where TG is 0 or 1, then sum the boolean flags per TIME
df1 = (df.assign(num_0=df['TG'].eq(0),
                 num_1=df['TG'].eq(1))
         .groupby('TIME')
         .agg(num_ID=('TG', 'size'),
              num_1=('num_1', 'sum'),
              num_0=('num_0', 'sum'))
         .reset_index())
print(df1)
       TIME  num_ID  num_1  num_0
0  20210101       2      0      2
1  20210201       3      2      1
2  20210301       1      1      0
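
Both solutions count rows per TIME with size, which equals the number of IDs only when each ID appears at most once per TIME value, as in the sample. If an ID can repeat within the same TIME, counting distinct IDs with nunique may be closer to what is wanted. The sketch below assumes such data: the last row (333, 20210201, 0) is a hypothetical duplicate added for illustration and is not part of the question, and the num_rows column is likewise only there to show the difference.

```python
import pandas as pd

# Sample data from the question plus one HYPOTHETICAL extra row
# (ID 333 appearing a second time in TIME 20210201).
df = pd.DataFrame({
    "ID":   [111, 111, 111, 222, 222, 333, 333],
    "TIME": [20210101, 20210201, 20210301, 20210101, 20210201, 20210201, 20210201],
    "TG":   [0, 0, 1, 0, 1, 1, 0],
})

result = (df.assign(num_1=df["TG"].eq(1), num_0=df["TG"].eq(0))
            .groupby("TIME")
            .agg(num_rows=("ID", "size"),     # every row, duplicates included
                 num_ID=("ID", "nunique"),    # distinct IDs only
                 num_1=("num_1", "sum"),
                 num_0=("num_0", "sum"))
            .reset_index())
print(result)
```

For TIME 20210201 this gives num_rows = 4 but num_ID = 3, because ID 333 is counted once.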