Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Very Tricky removing erroneous tags whilst grouping by multiple fields in Pandas

I am looking to groupby several columns if the prefix is similar and take the sum based off of categorical values within a column.

Data

name      type    size   month
AA:3400            5     august
AA:3401   FALSE    1     august
AA:3402   FALSE    2     august
AA:3404   TRUE     0     august
AA:3409   FALSE    1     september
AA:3410   FALSE    8     september
AA:3412   FALSE    9     september
BB:3400   TRUE     4     august
BB:3401   FALSE    7     august

Desired

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

name    type    size  month
AA      TRUE    0     august
AA      FALSE   3     august
AA              5     august
BB      TRUE    4     august
BB      FALSE   7     august
AA      TRUE    0     september
AA      FALSE   18    september

Doing

However, how can I group if the value has the same prefix? Any suggestion is appreciated.

  out = (
    df.assign(type= df["name"].astype(
        pd.CategoricalDtype(["TRUE", "FALSE"], ordered=True)))
      .groupby([df["name", "date"].str.split(":").str[0], "type"],
               dropna=False, group_keys=False)["size"].sum().reset_index()
)

However, I am not sure how to incorporate multiple fields in this grouping. Any suggestion is appreciated.

>Solution :

You can do

out = df.groupby([df["name"].str.split(":").str[0],
                  df['month'],
                  df['type']])['size'].sum().unstack('type',fill_value=0).stack().reset_index(name='size')
Out[61]: 
  name      month   type  size
0   AA     august  FALSE     3
1   AA     august   None     5
2   AA     august   TRUE     0
3   AA  september  FALSE    18
4   AA  september   None     0
5   AA  september   TRUE     0
6   BB     august  FALSE     7
7   BB     august   None     0
8   BB     august   TRUE     4
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading