I’m new to Python and pandas. I have a data set with the columns age, sex, bmi, children, smoker, region, charges. I want to count the rows where smoker is "yes", grouped by region. The region can only be northwest, northeast, southwest, or southeast.
I have tried several groupby and Series commands, but I’m not getting the result I want. Can anyone help? Thanks in advance.
I’ve tried:
data.groupby('Region').count()
data.groupby('Region').apply(lambda g: pd.Series(g['Smoker'].str.contains("y").count()))
data['Smoker'].value_counts().reindex(['Region'])
None of them worked.
>Solution :
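You can group by both region and smoker and count the rows in each group (df here is your DataFrame, called data in the question):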
df.groupby(['region', 'smoker']).count()
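Note that count() returns a non-null count for every remaining column, for both the "yes" and the "no" groups. If you only want one number per region for the "yes" answers, a minimal sketch could look like the following. It assumes the column names are lowercase, as listed in the question (the attempts use 'Region' and 'Smoker', which would raise a KeyError if the real names are lowercase). The sample rows and the names regions and smokers_per_region are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; the real data set has more columns and rows.
data = pd.DataFrame({
    "region": ["northwest", "northeast", "southwest",
               "southeast", "southeast", "northeast"],
    "smoker": ["yes", "no", "yes", "yes", "yes", "no"],
})

# The four regions named in the question.
regions = ["northwest", "northeast", "southwest", "southeast"]

# Keep only the rows where smoker is "yes", count them per region,
# and reindex so that regions without any smokers show up as 0.
smokers_per_region = (
    data.loc[data["smoker"] == "yes", "region"]
    .value_counts()
    .reindex(regions, fill_value=0)
)

print(smokers_per_region)
```

The reindex step is optional; without it, regions that have no "yes" rows are simply missing from the result.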