Pandas: find column min and max ranges by group

I have a data frame like so:

import numpy as np
import pandas as pd

# 20 random rows, split into four groups of five by 'Gen'
df = pd.DataFrame({
    'A': np.random.random(20),
    'B': np.random.random(20),
    'Gen': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4]
})

Is there a simple way with pandas to find the range (min and max values) of each column within each 'Gen' group? In real life, I can have lots of columns A, B, … (>20). The result could look something like:

Var   Gen    Min   Max
A     1      x     x
B     1      x     x
A     2      x     x
...

Solution:

You can melt to reshape the frame to long format (one row per Gen/column/value), then groupby + agg to compute the min and max per group:

(df.melt(id_vars='Gen', var_name='Var')
   .groupby(['Var', 'Gen'])['value']
   .agg(['min', 'max'])
   #.reset_index()  # uncomment to have Gen/Var as columns
)
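To get exactly the Var / Gen / Min / Max layout asked for in the question, the same pipeline can be finished with named aggregation and reset_index. The following is a minimal sketch, not the only way to do it; the seeded generator and the name `ranges` are assumptions added here for reproducibility and illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so repeated runs give the same numbers (arbitrary seed).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'A': rng.random(20),
    'B': rng.random(20),
    'Gen': np.repeat([1, 2, 3, 4], 5),  # same grouping as in the question
})

ranges = (
    df.melt(id_vars='Gen', var_name='Var')   # long format: Gen, Var, value
      .groupby(['Var', 'Gen'])['value']
      .agg(Min='min', Max='max')             # named aggregation sets the labels
      .reset_index()                         # turn Var/Gen back into columns
)
print(ranges)
```

Because melt treats every column other than 'Gen' as a value column, this works unchanged when the frame has many more columns than A and B.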

Alternatively, aggregate first, then stack (this puts Gen as the outer index level and the column name as the inner one):

df.groupby('Gen').agg(['min', 'max']).stack(level=0)

Output of the melt approach:

              min       max
Var Gen                    
A   1    0.009241  0.813241
    2    0.142218  0.751106
    3    0.005829  0.991914
    4    0.033108  0.579354
B   1    0.241094  0.990492
    2    0.030335  0.848735
    3    0.209076  0.785530
    4    0.277970  0.962307