Plot the mean of a variable by group and show the number of rows in each bar

I am using the following code to plot a bar chart:

df.groupby(['Borrow_Rank'])['Outcome'].mean().plot(kind = 'bar')

This shows the mean of the Outcome variable, which ranges from 0 to 1, for each group. However, I also need the bar chart to show how many rows are in each group (between 2,000 and 5,000 per group). I would appreciate any ideas.

Thank you in advance!

>Solution :

It is unclear how you want the counts displayed, but one option is to pass several aggregation functions to agg and plot the results side by side:

df.groupby(['Borrow_Rank'])['Outcome'].agg(['mean', 'count']).plot.bar()


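One caveat with plotting both columns on a shared axis: in the question the means lie between 0 and 1 while the counts are in the thousands, so the mean bars would be barely visible. A possible workaround is to put the counts on a second y-axis. The following is only a sketch using matplotlib's twinx with the dummy input from this answer; the names stats, ax_mean and ax_count are illustrative, not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; remove to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Dummy input from this answer
df = pd.DataFrame({'Borrow_Rank': list('AABABACC'), 'Outcome': range(8)})

# One row per group, with the mean and the number of rows
stats = df.groupby('Borrow_Rank')['Outcome'].agg(['mean', 'count'])

x = list(range(len(stats)))        # numeric bar positions
fig, ax_mean = plt.subplots()
ax_count = ax_mean.twinx()         # second y-axis sharing the same x-axis

# Means as bars on the left axis
ax_mean.bar(x, stats['mean'], color='tab:blue')
ax_mean.set_xticks(x)
ax_mean.set_xticklabels(stats.index)
ax_mean.set_ylabel('mean Outcome')

# Counts as markers on the right axis, with its own scale
ax_count.plot(x, stats['count'], color='tab:red', marker='o', linestyle='none')
ax_count.set_ylim(0, stats['count'].max() * 1.1)
ax_count.set_ylabel('rows per group')

# fig.savefig('mean_and_count.png')  # or plt.show() with an interactive backend
```

Each axis keeps its own scale, so the means stay readable regardless of how large the counts are.
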
Another option is to annotate each bar with its count as text:

g = df.groupby(['Borrow_Rank'])['Outcome']
s = g.mean()
ax = s.plot.bar()

for x, (y, count) in enumerate(zip(s, g.count())):
    ax.annotate(f'n = {count}', (x, y), ha='center', va='bottom')


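If your matplotlib is version 3.4 or newer, Axes.bar_label may be a shorter way to get the same kind of annotation, since it positions the text on each bar for you. This is a sketch under that version assumption, again using the dummy input from this answer; stats and labels are illustrative names.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; remove to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Dummy input from this answer
df = pd.DataFrame({'Borrow_Rank': list('AABABACC'), 'Outcome': range(8)})

# Compute mean and row count per group in a single pass
stats = df.groupby('Borrow_Rank')['Outcome'].agg(['mean', 'count'])

fig, ax = plt.subplots()
stats['mean'].plot.bar(ax=ax)
ax.set_ylabel('mean Outcome')

# One label per bar, in the same order as the bars
labels = [f'n = {n}' for n in stats['count']]
texts = ax.bar_label(ax.containers[0], labels=labels)  # requires matplotlib >= 3.4

# fig.savefig('mean_with_counts.png')  # or plt.show() with an interactive backend
```

The labels list must follow the same group order as the bars, which holds here because both come from the same stats frame.
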
Reproducible dummy input:

import pandas as pd
df = pd.DataFrame({'Borrow_Rank': list('AABABACC'), 'Outcome': range(8)})
