How to group by with array in Python

I have a dataframe df_my that looks like this:

       Rows   Seq    Alg          iMap_x
0      1000   1      Max(1,2)      12
1      1000   2      Min(4)        37
2      1000   3      Max(1,2)      28
3      1000   4      Max(1,2)      18
4      1000   5      Sum()         33
..
134    1000   135    Min(4)        04
135    1000   136    Sum()         11
136    1000   137    Max(1,2)      24

I want a new dataframe that is grouped by Alg and holds an array of the iMap_x values for each group, so that it looks like this:

       Alg           iMap_x
0      Max(1,2)      [12,28,18,..,24]
1      Min(4)        [37,..,04]
4      Sum()         [33,..,11]

I know that I can group by Alg and then take the sum or the average:

df_my[["Alg","iMap_x"]].groupby(by="Alg").sum()

but I do not know how to collect the values of each group into an array.

Solution:

Try:

print(df[["Alg", "iMap_x"]].groupby("Alg").agg(list).reset_index())

Prints:

        Alg            iMap_x
0  Max(1,2)  [12, 28, 18, 24]
1    Min(4)           [37, 4]
2     Sum()          [33, 11]

DataFrame used:

     Rows  Seq       Alg  iMap_x
0    1000    1  Max(1,2)      12
1    1000    2    Min(4)      37
2    1000    3  Max(1,2)      28
3    1000    4  Max(1,2)      18
4    1000    5     Sum()      33
134  1000  135    Min(4)       4
135  1000  136     Sum()      11
136  1000  137  Max(1,2)      24
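
For anyone who wants to run the answer end to end, here is a minimal self-contained sketch. It assumes only the eight rows visible in the question (the elided rows are not reconstructed) and uses the name df as in the answer rather than df_my. The last step, which converts each list into a NumPy array, is an optional addition that is not part of the original answer; the name as_arrays is made up for illustration.

```python
import numpy as np
import pandas as pd

# Only the rows shown in the question; the rows hidden behind ".." are left out.
df = pd.DataFrame(
    {
        "Rows": [1000] * 8,
        "Seq": [1, 2, 3, 4, 5, 135, 136, 137],
        "Alg": ["Max(1,2)", "Min(4)", "Max(1,2)", "Max(1,2)",
                "Sum()", "Min(4)", "Sum()", "Max(1,2)"],
        "iMap_x": [12, 37, 28, 18, 33, 4, 11, 24],
    },
    index=[0, 1, 2, 3, 4, 134, 135, 136],
)

# Collect the iMap_x values of each Alg group into one Python list.
grouped = df.groupby("Alg")["iMap_x"].agg(list).reset_index()
print(grouped)

# Optional (an assumption about what "array" means here): convert each
# list into a NumPy array. The column keeps dtype object either way.
as_arrays = grouped.assign(iMap_x=grouped["iMap_x"].map(lambda v: np.array(v)))
```

Selecting the iMap_x column before aggregating keeps the other columns (Rows, Seq) out of the result, which has the same effect as subsetting with df[["Alg", "iMap_x"]] first.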