I have the following DataFrame.
name score cond
0 A 10 0
1 A 5 1
2 A 20 2
3 A 30 3
4 B 10 0
5 B 20 0
6 B 30 4
7 B 40 6
What I want to do is group based on name, select the rows with cond == 0, then calculate the average score for those rows and assign that value to a new column for every member of that group.
The end result should look like this.
name score cond group_avg
0 A 10 0 10
1 A 5 1 10
2 A 20 2 10
3 A 30 3 10
4 B 10 0 15
5 B 20 0 15
6 B 30 4 15
7 B 40 6 15
I have been looking into groupby, transform, and filter but I cannot figure out the right combination to use.
>Solution :
Just mask the unwanted values before groupby.transform:
df['group_avg'] = (df['score'].where(df['cond'].eq(0))
.groupby(df['name']).transform('mean')
)
Alternatively, filter, aggregate, map:
df['group_avg'] = df['name'].map(df.loc[df['cond'].eq(0)]
.groupby('name')['score'].mean())
Output:
name score cond group_avg
0 A 10 0 10.0
1 A 5 1 10.0
2 A 20 2 10.0
3 A 30 3 10.0
4 B 10 0 15.0
5 B 20 0 15.0
6 B 30 4 15.0
7 B 40 6 15.0
Intermediate:
df['score'].where(df['cond'].eq(0))
0 10.0
1 NaN
2 NaN
3 NaN
4 10.0
5 20.0
6 NaN
7 NaN
Name: score, dtype: float64