Poisson fit curve over histogram plot

March 14, 2022

I'd like to fit my empirical data to a Poisson distribution curve.

I have a given mean, say 2.3, and the empirical data.

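import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from scipy.stats import poisson

def fit_poisson(data=None,network=None,mu=2.3):
    sns.set_theme()
    fig, ax = plt.subplots(1, 1)

    # integer x values spanning the central mass of Poisson(mu)
    x = np.arange(poisson.ppf(0.01, mu),
                  poisson.ppf(0.99, mu))

    # histogram of the empirical data, normalized as a density
    sns.histplot(data, stat='density', ax=ax)

    # theoretical Poisson probabilities drawn on the same axes
    ax.plot(x, poisson.pmf(x, mu))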

It plots:

Apparently there is a range issue on the y axis here. Maybe a problem with lambda? How do I properly fit my empirical histogram to a Poisson distribution curve with the same mean?

>Solution :

Poisson random variables are discrete: the y value of their PMF is a "probability", not a "density". But by default histplot does not assume that your data are discrete, and in this case it chooses bins with binwidth < 1.

Because density normalization forces the total area of the bars to sum to 1, each bar's height is its share of the observations divided by the binwidth. With binwidth < 1, the density value for the bar containing the observations of a certain value is therefore greater than the probability mass on that value.
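To make the relationship concrete, here is a minimal sketch using np.histogram on a small made-up sample (the data and the 0.5 binwidth are assumptions chosen for illustration; they are not the asker's data or the bins seaborn would pick):

```python
import numpy as np

# Small made-up sample of counts (illustrative only).
data = np.array([0, 1, 1, 2, 2, 2, 3, 3, 4, 5])

# Bins narrower than 1, so each integer value gets its own bar.
binwidth = 0.5
edges = np.arange(data.min(), data.max() + 2 * binwidth, binwidth)

counts, _ = np.histogram(data, bins=edges)
density, _ = np.histogram(data, bins=edges, density=True)

# Probability normalization: bar heights sum to 1.
probability = counts / counts.sum()

# Density normalization: bar areas sum to 1, so height = probability / binwidth.
print(probability[4], density[4])  # bar holding the value 2
```

For the bar holding the value 2, the probability is 0.3 while the density is 0.6, which is the kind of inflation that makes the histogram sit above the PMF curve.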

There are two relevant parameters here:

- stat="probability" makes the heights of the bars sum to 1, so they will match the PMF (assuming binwidth <= 1, so that at most one unique value falls in each bar)
- discrete=True sets binwidth=1 (and aligns the center of each bar with the integer values)

sns.histplot(data, stat='probability', discrete=True, shrink=.8)

I’ve also added shrink=0.8, which draws the bars a bit narrower than the binwidth; this helps emphasize the discrete nature of the data.

(Note that with discrete=True, which implies binwidth=1, density and probability normalization give the same bar heights, so discrete=True alone is enough to fix the plot. Still, "Probability" is the right y axis label to use here.)