I have 3 users: s1 has 10 dollars, s2 has 10 and 20 dollars, and s3 has 20, 20, and 30 dollars. I want to calculate the percentage of users who had 10, 20, and 30 dollars. Is my interpretation correct here?
Input:
import pandas as pd
df1 = pd.DataFrame({'users': ['s1', 's2', 's2', 's3', 's3', 's3'],
                    'dollars': [10, 10, 20, 20, 20, 30]})
Expected output:
% of subjects who had 10 dollars 0.4
% of subjects who had 20 dollars 0.4
% of subjects who had 30 dollars 0.2
What I tried:
df1.groupby(['dollars']).agg({'dollars': 'sum'}) / df1['dollars'].sum() * 100
>Solution :
To get the percentage of users who have each kind of bill, you can use a crosstab:
out = pd.crosstab(df1['users'], df1['dollars']).gt(0).mean().mul(100)
output:
dollars
10 66.666667
20 66.666667
30 33.333333
dtype: float64
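To see where these numbers come from, the one-liner can be split into its intermediate steps. This is only an illustrative sketch of the same chain; the names counts and has_bill are made up here for readability and are not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'users': ['s1', 's2', 's2', 's3', 's3', 's3'],
                    'dollars': [10, 10, 20, 20, 20, 30]})

# Step 1: count how many times each user holds each amount
# (rows are users, columns are dollar amounts).
counts = pd.crosstab(df1['users'], df1['dollars'])

# Step 2: turn counts into True/False so that s3's two 20-dollar
# entries only count once.
has_bill = counts.gt(0)

# Step 3: the column mean of a boolean table is the fraction of users
# holding that amount; multiply by 100 for a percentage.
out = has_bill.mean().mul(100)
print(out)
```

The gt(0) step is what makes this a per-user percentage: without it, the duplicate 20 for s3 would be counted twice.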
If you want normalized counts:
out/out.sum()
Output:
dollars
10 0.4
20 0.4
30 0.2
dtype: float64
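As a cross-check, the same two results can probably also be reached without a crosstab by counting distinct users per amount with groupby and nunique. This is an alternative sketch, not the approach from the answer above; the names users_per_amount, pct_users, and share are hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({'users': ['s1', 's2', 's2', 's3', 's3', 's3'],
                    'dollars': [10, 10, 20, 20, 20, 30]})

# Number of distinct users holding each amount (duplicates per user ignored).
users_per_amount = df1.groupby('dollars')['users'].nunique()

# Percentage of all users who hold each amount.
pct_users = users_per_amount / df1['users'].nunique() * 100

# Normalized counts: each amount's share of all (user, amount) pairs.
share = users_per_amount / users_per_amount.sum()

print(pct_users)
print(share)
```

Note that the percentages do not sum to 100 because one user can hold several amounts, while the normalized counts sum to 1 by construction. The attempt in the question divides dollar sums rather than user counts, which is why it matches neither result.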