Fix range of matplotlib histogram with multiple datasets?

I have multiple datasets that I'd like to combine into a single histogram. Unfortunately, the datasets cover different ranges.

Using the simple range=(0,1) argument triggers weird behaviour.

This is what it should look like:

enter image description here

and it currently does when the data has the range 0-1.

But sometimes it does not, and I get:

enter image description here

Here I would like to force the 0-1 range.

But when I do so with

axs.hist(plot_data, range=(0,1),color=colors,label=plot_label, histtype='bar')

I get:

enter image description here

I cannot explain why that happens.

This is the code:

fig, axs = plt.subplots(1,1,tight_layout=True)
fig.set_size_inches(7, 4)
# Define colors for each histogram
colors = ['g', 'b', 'r', 'purple']
# make combined histogram for all error conditions
# plot the histogram
axs.hist(plot_data,color=colors,label=plot_label, histtype='bar')
# Put a legend to the right of the current axis
axs.legend(loc='center left',bbox_to_anchor=(1, 0.5))
# set the axis labels
axs.set_xlabel(f"Similarity: {wildcards.metric}")
axs.set_ylabel("Frequency")
# have the x-axis go increasing from left to right
axs.invert_xaxis()
# ensure proper layout
fig.tight_layout()
# save the histogram
fig.savefig(Path(output.hist))

and some example data is:

[array([
       '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0',
       '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0',
       '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0',
       '1.0', '1.0', '1.0'], dtype='<U46'),
 array([
       '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0',
       '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0',
       '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0', '1.0',
       '1.0', '1.0', '1.0'], dtype='<U46'),
 array([
       '0.25', '0.3', '0.3', '0.3', '0.3', '0.3', '0.0', '0.0', '0.3',
       '0.0', '0.3', '0.3', '0.0', '0.0', '0.3', '0.5', '0.4', '0.0',
       '0.5', '0.5', '0.5', '0.0', '0.4', '0.0', '0.0', '0.6', '0.3',
       '0.1', '0.0', '0.0'], dtype='<U46'),
 array([
       '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0',
       '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0',
       '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0', '0.0',
       '0.0', '0.0', '0.0'], dtype='<U46')]

I have tried every possible range and bin combination.

>Solution :

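Your arrays contain string values, but the range parameter expects numeric (float/int) values. Matplotlib treats strings as categorical data, so range=(0,1) is most likely being applied to category positions rather than to the numbers the strings represent. I'm not sure why your values are strings, but using float values (and dropping dtype='<U46') makes the graph behave as expected with the range parameter: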

import matplotlib.pyplot as plt
import numpy as np

plot_data = [np.array([
       1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,1.0,
       1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
       1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
       1.0, 1.0, 1.0]),
 np.array([
     1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
     1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
     1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
     1.0, 1.0, 1.0]),
 np.array([
     0.25, 0.3, 0.3, 0.3, 0.3, 0.3, 0.0, 0.0, 0.3,
     0.0, 0.3, 0.3, 0.0, 0.0, 0.3, 0.5, 0.4, 0.0,
     0.5, 0.5, 0.5, 0.0, 0.4, 0.0, 0.0, 0.6, 0.3,
     0.1, 0.0, 0.0]),
 np.array([
     0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,0.0,0.0,
     0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
     0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
     0.0, 0.0, 0.0])]

fig, axs = plt.subplots(1,1,tight_layout=True)
fig.set_size_inches(7, 4)
# Define colors for each histogram
colors = ['g', 'b', 'r', 'purple']
plot_label = ['IDEAL', 'TYPICAL', 'LARGE', 'EXTREME']
# make combined histogram for all error conditions
# plot the histogram
axs.hist(plot_data, range=(0,1),color=colors,label=plot_label, histtype='bar')
#axs.hist(plot_data,histtype='bar', color=colors, label=plot_label)
# Put a legend to the right of the current axis
axs.legend(loc='center left',bbox_to_anchor=(1, 0.5))
# set the axis labels
axs.set_xlabel("Similarity")
axs.set_ylabel("Frequency")
# invert the x-axis; note that plt.xlim(0, 1) below resets it to run 0 to 1 from left to right
axs.invert_xaxis()
# ensure proper layout
fig.tight_layout()
plt.xlim(0, 1)
plt.show()

enter image description here
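
If the arrays arrive as strings from an earlier step, retyping them by hand should not be necessary. A minimal sketch, assuming every string parses as a float: convert each array with astype(float) before calling hist. The raw_data values below are a shortened, hypothetical stand-in for the question's data, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical, shortened stand-in for the string-typed arrays in the question
raw_data = [
    np.array(['1.0', '1.0', '1.0'], dtype='<U46'),
    np.array(['0.25', '0.3', '0.0'], dtype='<U46'),
]

# Convert each array from strings to floats so hist sees numeric data
plot_data = [arr.astype(float) for arr in raw_data]

fig, axs = plt.subplots(1, 1, tight_layout=True)
# With numeric input, range=(0, 1) fixes the bin edges to span 0 to 1
counts, bin_edges, _ = axs.hist(
    plot_data,
    range=(0, 1),
    color=['g', 'r'],
    label=['IDEAL', 'LARGE'],
    histtype='bar',
)
axs.legend(loc='center left', bbox_to_anchor=(1, 0.5))
```

With the conversion in place, the bin edges cover the forced 0-1 range regardless of which values each dataset happens to contain.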
