Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Group by columns under conditions to calculate average

I have a raw dataset like below:

ColA ColB duration interval Counter
A SD 2 4 1
A SD 3 3 2
A UD 2 1 10
B UD 1 2 2
B UD 2 2 2
B SD 3 3 13
B SD 1 4 19

I am expecting an output result like below:

enter image description here

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Explanation of the output:

  1. SumCounter is the sum of counter values on the group by ColA and ColB values.
  2. AvgdurationSD/UD and AvgIntervalSD/UD are created by taking the average over ColA and ColB and having a 0 value incase the columns dont match the criteria (e.g. AvgDurationSD and AvIntervalSD has 0 value for a group of ColA = A and ColB = UD.

I understand that I have to use group by and agg functions to apply here but I am not really sure how to apply conditions for ColB on individual new columns.

Any help is appreciated:)

>Solution :

Use DataFrame.pivot_table with helper column new by copy like ColB, then flatten MultiIndex and add ouput to new DataFrame created by aggregate sum:

df1 = (df.assign(new=df['ColB'])
         .pivot_table(index=['ColA', 'ColB'], 
                      columns='new', 
                      values=['interval','duration'], 
                      fill_value=0,
                      aggfunc='mean'))
df1.columns = df1.columns.map(lambda x: f'{x[0]}{x[1]}')
df = (df.groupby(['ColA','ColB'])['Counter']
        .sum()
        .to_frame(name='SumCounter')
        .join(df1).reset_index())
print (df)
  ColA ColB  SumCounter  durationSD  durationUD  intervalSD  intervalUD
0    A   SD           3         2.5         0.0         3.5           0
1    A   UD          10         0.0         2.0         0.0           1
2    B   SD          32         2.0         0.0         3.5           0
3    B   UD           4         0.0         1.5         0.0           2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading