Suppose I have data like this:
Length Width Height
100 140 100
120 150 110
140 160 120
160 170 130
170 190 140
200 200 150
210 210 160
220 220 170
Now, I want to know the distribution of the data in each column, binned with a certain increment.
For example:
Suppose I want to see the distribution of the Length column from 100 to 160 with an increment of 30. I would like the output to look like this:
Min  Max  count  Percentage  Remaining values (outside the given range)
100  130  1      12.5        7
130  160  2      25          5
Also, how can I draw a bar graph from it?
Please help
>Solution :
IIUC, you could use pandas.cut:
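import pandas as pd

# Bin 'Length' into (100, 130] and (130, 160]; pd.cut intervals are right-closed by default
(df.groupby(pd.cut(df['Length'], bins=[100,130,160]))
   ['Length'].agg(count='count')
   .assign(**{'Remaining value': lambda d: len(df)-d['count'],  # rows not in this bin
              'Percentage': lambda d: d['count']/len(df)*100,   # share of all rows
             })
)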
output:
            count  Remaining value  Percentage
Length
(100, 130]      1                7        12.5
(130, 160]      2                6        25.0
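Note that the "Remaining value" column above is computed per bin (8 - 2 = 6 for the second bin), whereas the table in the question shows 7 and then 5, which looks like a running total of the rows not yet counted. If that is what is wanted, the following is a minimal sketch, not a definitive implementation. The helper name `bin_summary` and its `start`/`stop`/`step` parameters are hypothetical, and the sketch assumes that `stop - start` is a multiple of `step`.

```python
import numpy as np
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "Length": [100, 120, 140, 160, 170, 200, 210, 220],
    "Width":  [140, 150, 160, 170, 190, 200, 210, 220],
    "Height": [100, 110, 120, 130, 140, 150, 160, 170],
})

def bin_summary(series, start, stop, step):
    """Hypothetical helper: summarise `series` in right-closed bins of width `step`."""
    # Bin edges, e.g. [100, 130, 160] for start=100, stop=160, step=30
    edges = np.arange(start, stop + step, step)
    binned = pd.cut(series, bins=edges)
    # observed=False keeps empty bins, so there is one count per bin
    counts = series.groupby(binned, observed=False).count()
    out = pd.DataFrame({
        "Min": edges[:-1],
        "Max": edges[1:],
        "count": counts.to_numpy(),
    })
    out["Percentage"] = out["count"] / len(series) * 100
    # Running total: rows not yet counted after each bin (assumed meaning of "Remaining")
    out["Remaining"] = len(series) - out["count"].cumsum()
    return out

summary = bin_summary(df["Length"], 100, 160, 30)
print(summary)
```

The same call works for the other columns, e.g. `bin_summary(df["Width"], 140, 200, 30)`. Because the bins are right-closed, a value equal to `start` (here 100) falls outside the first bin; passing `include_lowest=True` to `pd.cut` would include it, at the cost of changing the first count.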
For graphing, many libraries can bin and plot the data automatically.
Example with seaborn:
import seaborn as sns
sns.histplot(df, bins=[100,130,160,190,220])
output: [figure: overlaid histograms of Length, Width and Height]
or
sns.displot(df.melt(), x='value', col='variable',
kind='hist', bins=[100,130,160,190,220])
output: [figure: one histogram panel per column]
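The question asks for a bar graph of the binned counts rather than a histogram of the raw values. A rough sketch of one way to do that with matplotlib follows; the output file name `length_bins.png` and the label format are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to display a window instead
import matplotlib.pyplot as plt
import pandas as pd

# 'Length' column from the question
lengths = pd.Series([100, 120, 140, 160, 170, 200, 210, 220], name="Length")
bins = [100, 130, 160, 190, 220]

# Count the values falling into each right-closed bin, keeping empty bins
counts = lengths.groupby(pd.cut(lengths, bins=bins), observed=False).count()

# Turn each interval into a readable label such as "100-130"
labels = [f"{iv.left}-{iv.right}" for iv in counts.index]

fig, ax = plt.subplots()
ax.bar(labels, counts.to_numpy())
ax.set_xlabel("Length bin")
ax.set_ylabel("count")
fig.savefig("length_bins.png")  # hypothetical output file name
```

Plotting the precomputed counts keeps the chart consistent with the summary table, since both come from the same `pd.cut` bins.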

