While doing EDA on the Titanic dataset on Kaggle, I combined the "Parch" and "SibSp" columns into a single feature, "relative", containing the total number of relatives of each passenger.
data['relative'] = data['Parch'] + data['SibSp']
On plotting a barplot of relative vs. Survived using Seaborn, I got the following output:
Query
I can’t understand this barplot. The Survived feature in this dataset has only two values, 1 and 0, and relative has 6 values ranging from 1 to 6, so how is this barplot generated, and what information is it showing?
>Solution :
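The plot shows the mean of the ‘Survived’ values for each category (‘N of relatives’). Because Survived is either 0 or 1, that mean is the fraction of passengers in each group who survived. You can pass the estimator parameter to change this behaviour (e.g. estimator=np.median, with NumPy imported as np).
From the seaborn.barplot documentation (help(seaborn.barplot)):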
A bar plot represents an estimate of central tendency for a numeric
variable with the height of each rectangle and provides some
indication of the uncertainty around that estimate using error bars.
Bar plots include 0 in the quantitative axis range, and they are a
good choice when 0 is a meaningful value for the quantitative
variable, and you want to make comparisons against it.
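To see where the bar heights come from, the per-group mean can be computed by hand. The following is a minimal sketch using only the standard library and a made-up toy sample (the rows below are illustrative, not the real Titanic data); the names rows, groups and bar_heights are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Toy rows of (relative, Survived); made-up values, not the real Titanic data.
rows = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 0), (1, 1), (2, 1), (2, 1), (3, 0)]

# Collect the 0/1 Survived values for each number of relatives.
groups = defaultdict(list)
for relative, survived in rows:
    groups[relative].append(survived)

# The mean of a 0/1 column is the fraction of 1s, i.e. the survival rate.
# With the default estimator, this is the height drawn for each bar.
bar_heights = {k: mean(v) for k, v in sorted(groups.items())}

for relative, height in bar_heights.items():
    print(f"relative={relative}: survival rate = {height:.2f}")

# With the real DataFrame, the pandas equivalent would be:
# data.groupby('relative')['Survived'].mean()
```

Comparing the output of the groupby line on the real data with the bar heights in the plot should confirm that each bar is the survival rate for that number of relatives.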
