Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python pandas groupby then transform based on condition in another column

I have the following DataFrame.

   name  score  cond
0    A     10    0
1    A      5    1
2    A     20    2
3    A     30    3
4    B     10    0
5    B     20    0
6    B     30    4
7    B     40    6

What I want to do is group based on name, select the rows with cond == 0, then calculate the average score for those rows and assign that value to a new column for every member of that group.

The end result should look like this.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

    name  score  cond group_avg
0    A     10     0       10
1    A      5     1       10
2    A     20     2       10
3    A     30     3       10
4    B     10     0       15
5    B     20     0       15
6    B     30     4       15
7    B     40     6       15

I have been looking into groupby, transform, and filter but I cannot figure out the right combination to use.

>Solution :

Just mask the unwanted values before groupby.transform:

df['group_avg'] = (df['score'].where(df['cond'].eq(0))
                   .groupby(df['name']).transform('mean')
                  )

Alternatively, filter, aggregate, map:

df['group_avg'] = df['name'].map(df.loc[df['cond'].eq(0)]
                                   .groupby('name')['score'].mean())

Output:

  name  score  cond  group_avg
0    A     10     0       10.0
1    A      5     1       10.0
2    A     20     2       10.0
3    A     30     3       10.0
4    B     10     0       15.0
5    B     20     0       15.0
6    B     30     4       15.0
7    B     40     6       15.0

Intermediate:

df['score'].where(df['cond'].eq(0))

0    10.0
1     NaN
2     NaN
3     NaN
4    10.0
5    20.0
6     NaN
7     NaN
Name: score, dtype: float64
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading