Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Find the minimum and maximum values in the columns of a pandas data frame

I did look for a question similar to mine, but I did not find any answers still.

I have a data frame like this:

          achaea    bacteria    plastids    mitochondrion   viruses
CTAG    -22.141701  -27.891441  -2.474725   0.262533    0.026349
GGCC    -13.403537  -21.490028  -0.403491   -0.271403   -0.243087
GATC    -20.933825  -14.761891  4.681494    -0.098965   0.088650
CATG    -8.490766   -9.910195   1.150736    -0.005730   0.508743
TAAG    -17.376165  -18.653078  -1.525354   -0.708633   -1.917676

And my doubt is: ‘How can I get the minimum and maximum values from the columns?"
I would like some like this:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Minimum values:

archaea     CTAG  -22.141701 
bacteria    CTAG  -27.891441   
plastids    CTAG  -2.474725
mitochondrion   TAAG -0.708633
viruses    TAAG    -1.917676

Maximum values:

archaea     CATG    -8.4907661 
bacteria    CATG    -9.910195   
plastids    GATC  4.681494
mitochondrion   CTAG 0.262533
viruses    CATG    0.508743

I have tried:

df.min()
achaea          -22.141701
bacteria        -27.891441
plastids         -4.654833
mitochondrion    -0.881587
viruses          -1.917676
dtype: float64

df['achaea'].idxmin()
'CTAG'

df.reset_index().min()
index                AAAA
achaea           -22.1417
bacteria         -27.8914
plastids         -4.65483
mitochondrion   -0.881587
viruses          -1.91768
dtype: object

Well, I tried this and it is very close:

for col, idx in zip(df.columns, df.index):
    print(df[col].min(), idx, col)

-22.141701229820306 CTAG archaea
-27.89144069672985 GGCC bacteria
-4.654832775512324 GATC plastids
-0.8815871622500514 CATG mitochondrion
-1.917675731085761 TAAG viruses

Any suggestion to improve the efficiency and the code would be appreciate! Thank you for your time and effort.
Paulo

>Solution :

You can use:

df.where(df.eq(df.min())).T.stack()
  • identify the column values equal to the min per column
  • mask the other values
  • stack to reshape while dropping the NaNs (transpose first to change the order or the index)

output:

achaea         CTAG   -22.141701
bacteria       CTAG   -27.891441
plastids       CTAG    -2.474725
mitochondrion  TAAG    -0.708633
viruses        TAAG    -1.917676
dtype: float64
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading