I’d like to ask how to draw a Probability Density Function (PDF) plot in Python.
This is my code.
import numpy as np
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
import scipy.stats as stats
x = np.random.normal(50, 3, 1000)
source = {"Genotype": ["CV1"]*1000, "AGW": x}
df=pd.DataFrame(source)
df
I generated a data frame. Then, I tried to draw a PDF graph.
df["AGW"].sort_values()
df_mean = np.mean(df["AGW"])
df_std = np.std(df["AGW"])
pdf = stats.norm.pdf(df["AGW"], df_mean, df_std)
plt.plot(df["AGW"], pdf)
I obtained the graph above. What did I do wrong? Could you let me know how to draw the Probability Density Function (PDF) plot, which is also known as the normal distribution graph?
Could you let me know which code (or library) I need to use to draw the PDF graph?
Many thanks, as always!
>Solution :
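You just need to sort the values. plt.plot connects the points in the order they are given, so unsorted x values make the line jump back and forth instead of tracing a single curve: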
pdf = stats.norm.pdf(df["AGW"].sort_values(), df_mean, df_std)
plt.plot(df["AGW"].sort_values(), pdf)
And it will work.
The line df["AGW"].sort_values() doesn’t change df: it returns a new sorted Series, and that result is discarded. Maybe you meant df.sort_values(by=['AGW'], inplace=True).
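A quick way to check this behaviour is a tiny sketch with a made-up three-value Series (the values and the names s, unchanged and s_sorted are only for illustration, not part of the original code):

```python
import pandas as pd

# Hypothetical toy data, just to show how sort_values() behaves.
s = pd.Series([3.0, 1.0, 2.0], name="AGW")

# sort_values() returns a new, sorted Series. The result is discarded here,
# so s keeps its original order.
s.sort_values()
unchanged = s.tolist()

# Keeping the returned Series (or reassigning it) preserves the sorted order.
s_sorted = s.sort_values()
sorted_values = s_sorted.tolist()

print(unchanged)
print(sorted_values)
```

So either keep the returned object or sort in place. Both approaches should give the same plot.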
With the in-place sort, the full code will be:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as stats
# Simulate 1000 values from a normal distribution (mean 50, sd 3)
x = np.random.normal(50, 3, 1000)
source = {"Genotype": ["CV1"]*1000, "AGW": x}
df=pd.DataFrame(source)
# Sort df itself so the x values are in increasing order
df.sort_values(by=['AGW'], inplace=True)
df_mean = np.mean(df["AGW"])
df_std = np.std(df["AGW"])
# Evaluate the normal PDF at each sorted value and plot it
pdf = stats.norm.pdf(df["AGW"], df_mean, df_std)
plt.plot(df["AGW"], pdf)
Which gives: [figure: the PDF plotted as a single smooth bell-shaped curve]
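
As a possible alternative that avoids sorting altogether, the fitted PDF can be evaluated on an evenly spaced grid instead of at the data points. The sketch below is not part of the original answer. It assumes the same simulated data, and the names rng, agw, grid, mu and sigma, the fixed seed, the 200 grid points and the optional histogram overlay are all illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

# Seed chosen only so the sketch is reproducible (an assumption, not in the original).
rng = np.random.default_rng(0)
agw = rng.normal(50, 3, 1000)

mu = np.mean(agw)
sigma = np.std(agw)

# Evenly spaced x values spanning the observed range; already in increasing order.
grid = np.linspace(agw.min(), agw.max(), 200)
pdf = stats.norm.pdf(grid, mu, sigma)

# Optional: histogram normalised to unit area, so it shares the PDF's scale.
plt.hist(agw, bins=30, density=True, alpha=0.4)
plt.plot(grid, pdf)
plt.xlabel("AGW")
plt.ylabel("Density")
```

Because the grid is ordered by construction, the line is drawn left to right without any sorting step. In a script, plt.show() would be needed to display the figure.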


