Overlaying probability density functions on one plot

I would like to create a probability density function for the isotopic measurements of N from three NOx sources. The number of measurements varies between sources, so I’ve created three dataframes. Here is the code:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
#import matplotlib.ticker as plticker
#from matplotlib.ticker import (MultipleLocator, AutoMinorLocator)


df = pd.DataFrame({
    'Mobile':[15.6, 14.2, 14.4, 10.2, 13.1, 12.8, 13.3, 16.9, 15.8, 15.3, 16.9, 15.6, 15.6, 17, 16, 15.1, 15, 14.4,
              14.6, 16.2, 15.3, 16.4, -0.4, -2.9, 1.6, 9.8, 1.6, -8.1, -4.4, -0.4, 8.6]})
    
df1 = pd.DataFrame({
    'Soil':[-47, -37, -29, -26, -25, -24, -31, -23, -22, -19, -49, -42, -44, -37, -29, -29, -32, -31, -29, -28,
            -26.5, -30.8]})
df2 = pd.DataFrame({
    'Biomass Burning':[-2.7, -5, -5.9, -7.2, 3.2, 2.6, 3.8, 8.1, 12, 0.9, 1.3, 1.6, -1.5, -1.3, -0.1, 0.5, 4.4, 2,
                       2.9, 1.7, 3.2, 1.6, -0.3, -0.9]})

fig = plt.figure()
ax = fig.add_subplot()
ax.hist([df, df1, df2], label = ("Mobile", "Soil", "Biomass Burning"), bins=25, stacked=True, range=[0,25])

The problem is that I get the error ValueError: x must have 2 or fewer dimensions. I’ve tried a "flatten" method, but that raises AttributeError: 'DataFrame' object has no attribute 'flatten'. I am unsure what to try next to get the code to run and could use some help. I also suspect that hist might be the wrong function to use, since I want a probability density distribution. I’ve also tried:

sns.displot(data=[df,df1,df2], x=['Mobile','Soil','Biomass Burning'], hue='target', kind='kde', 
            fill=True, palette=sns.color_palette('bright')[:3], height=5, aspect=1.5)

But again, I run into the issue of the dataframes being different lengths. Thanks!

Solution:

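One option is to melt each dataframe into long format, concatenate the results, and then use hue with displot. Note that with hue, the KDE curves are normalized jointly by default, so their areas sum to 1 across all sources; passing common_norm=False normalizes each source separately.

import seaborn as sns
# melt() gives each frame a 'variable' (source name) and a 'value' (measurement) column
dfs = pd.concat([df.melt(), df1.melt(), df2.melt()], ignore_index=True)
sns.displot(data=dfs, x='value', hue='variable', kind='kde')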

Output:

[image: KDE curves for the three sources overlaid on one plot]
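
As for the original ValueError: it most likely arises because each DataFrame is itself two-dimensional (rows by one column), so a list of DataFrames is more than hist accepts. A minimal sketch of a matplotlib-only alternative follows, assuming the goal is one normalized histogram per source on shared axes. The frames are shortened to a few of the question's values, and the names series, counts, and edges are illustrative. It drops stacked=True so the histograms overlay rather than pile up, and drops range=[0,25], which would exclude every (negative) Soil value.

```python
import matplotlib.pyplot as plt
import pandas as pd

# The question's frames, shortened here to a few values each (illustrative subset).
df = pd.DataFrame({"Mobile": [15.6, 14.2, 14.4, 10.2, 13.1, -0.4, -2.9, 1.6, 9.8, 8.6]})
df1 = pd.DataFrame({"Soil": [-47, -37, -29, -26, -25, -24, -31, -23]})
df2 = pd.DataFrame({"Biomass Burning": [-2.7, -5, -5.9, -7.2, 3.2, 2.6, 3.8, 8.1, 12]})

# Selecting the single column of each frame yields 1-D Series,
# which hist accepts even when their lengths differ.
series = [df["Mobile"], df1["Soil"], df2["Biomass Burning"]]

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(
    series,
    bins=25,                # shared bin edges spanning all three sources
    density=True,           # without stacking, each source is normalized to unit area
    histtype="stepfilled",  # filled outlines drawn on top of one another
    alpha=0.5,              # transparency so overlapping regions stay visible
    label=[s.name for s in series],
)
ax.set_xlabel("Isotopic measurement of N")
ax.set_ylabel("Probability density")
ax.legend()
```

Call plt.show() or fig.savefig(...) afterwards to view the figure. Because density=True is used without stacked=True, each source is normalized on its own, so sources with different numbers of measurements remain directly comparable.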
