Poisson fit curve over histogram plot

I'd like to overlay a Poisson distribution curve on a histogram of my empirical data.

I have a given mean, say 2.3, and the empirical data.

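import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import poisson

def fit_poisson(data=None, network=None, mu=2.3):
    sns.set_theme()
    fig, ax = plt.subplots(1, 1)

    # Integers from the 1st percentile up to (but excluding) the 99th percentile of Poisson(mu)
    x = np.arange(poisson.ppf(0.01, mu),
                  poisson.ppf(0.99, mu))

    # Histogram of the empirical data, normalized as a density
    sns.histplot(data, stat='density')

    # Poisson PMF with the given mean
    plt.plot(x, poisson.pmf(x, mu))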

It plots:

enter image description here


Apparently there is a range issue on the y-axis here. Maybe a problem with lambda? How do I properly fit my empirical histogram to a Poisson distribution curve with the same mean?

>Solution :

Poisson random variables are discrete: the y value of their PMF is a "probability", not a "density". But by default histplot does not assume that your data are discrete, and in this case it chooses bins with binwidth < 1.

Because density normalization forces the area of all bars to sum to 1, that means the density value for the bar containing observations of a certain value will be greater than the probability mass on that value.
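To make the arithmetic concrete, here is a minimal sketch in plain Python. The sample and the binwidth of 0.5 are made up for illustration, and the helper bar_heights is hypothetical (it is not part of seaborn); it assumes every bin contains at most one integer value, so the density height is simply the probability divided by the binwidth.

```python
from collections import Counter

# Hypothetical sample of counts, made up for illustration.
data = [0, 1, 1, 2, 2, 2, 3, 3, 4, 5]


def bar_heights(data, binwidth):
    """Return {value: (probability, density)} for bins holding one integer value each."""
    n = len(data)
    counts = Counter(data)
    return {
        value: (count / n, count / (n * binwidth))
        for value, count in sorted(counts.items())
    }


heights = bar_heights(data, binwidth=0.5)
prob_2, dens_2 = heights[2]
print(prob_2, dens_2)  # probability of observing 2, and the density bar height
```

With a binwidth of 0.5, the density bar for the value 2 is twice as tall as the proportion of observations equal to 2, which is why the histogram overshoots the PMF.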

There are two relevant parameters here:

  • stat="probability", which makes the heights of the bars sum to 1, so they will match the PMF (assuming binwidth ≤ 1, so that at most one unique value falls in each bar)

  • discrete=True, which sets binwidth=1 (and aligns the center of each bar with the integer values)

    sns.histplot(data, stat='probability', discrete=True, shrink=.8)

enter image description here

I’ve also added shrink=0.8, which draws the bars a bit narrower than the binwidth; this helps emphasize the discrete nature of the data.

(Note that with discrete=True, which implies binwidth=1, density and probability normalization give the same bar heights, so discrete=True is all you strictly need. "Probability" is still the right y-axis label to use here.)
