I have a dataframe like as below
id,status,amount,qty 1,pass,123,4500 1,pass,156,3210 1,fail,687,2137 1,fail,456,1236 2,pass,216,324 2,pass,678,241 2,nan,637,213 2,pass,213,543 df = pd.read_clipboard(sep=',')
I would like to do the below
id and compute the pass percentage for each id
id and compute the average
amount for each id
So, I tried the below
df['amt_avg'] = df.groupby('id')['amount'].mean() df['pass_pct'] = df.groupby('status').apply(lambda x: x['status']/ x['status'].count()) df['fail_pct'] = df.groupby('status').apply(lambda x: x['status']/ x['status'].count())
but this doesn’t work.
I am having trouble in getting the pass percentage.
In my real data I have lot of columns like
status for which I have to find these % distribution of a specific value (ex: pass)
I expect my output to be like as below
id,pass_pct,fail_pct,amt_avg 1,50,50,2770.75 2,75,0,330.25
s = df.groupby('id')['qty'].mean() df = (pd.crosstab(df['id'], df['status'].fillna('nan'), normalize=0) .drop('nan', 1) .mul(100) .join(s.rename('amt_avg'))) print (df) fail pass amt_avg id 1 50.0 50.0 2770.75 2 0.0 75.0 330.25