Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to groupby multiple columns in pandas based on name?

I want to create a new dataframe with columns calculated as means of columns with similar names from this dataframe:

            B6_i     B6_ii    B6_iii  ...  BXD80_i   BXD80_ii   BXD81_i
data                                  ...                              
Cd38    0.598864 -0.225322  0.306926  ... -0.312190  0.281429  0.424752
Trim21  1.947399  2.920681  2.805861  ...  1.469634  2.103585  0.827487
Kpnb1  -0.458240 -0.417507 -0.441522  ... -0.314313 -0.153509 -0.095863
Six1    1.055255  0.868148  1.012298  ...  0.142565  0.264753  0.807692

The new dataframe should look like this:

               B6     BXD80  ...    BXD81
data                                 
Cd38    -0.041416 -0.087859  ...  0.424752
Trim21  15.958981  3.091500  ...  0.827487
Kpnb1   -0.084471  0.048250  ... -0.095863
Six1     0.927383  0.037745  ...  0.807692

(like (B6_i + B6_ii + B6_iii)/3), based on all characters until the underscore "_")

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Some columns are one of n columns, and others are singular (like ‘BXD81_i’), so I need a method that can work with a varying number for each mean calculation.

Sorry for writing so long, many thanks in advance ! 🙂

>Solution :

You can aggregate mean per columns by values before _:

df.columns = df.columns.str.split('_', expand=True)

df1 = df.groupby(level=0, axis=1).mean()

Or:

df1 = df.groupby(lambda x: x.split('_')[0], axis=1).mean()
print (df1)
              B6     BXD80     BXD81
data                                
Cd38    0.226823 -0.015381  0.424752
Trim21  2.557980  1.786609  0.827487
Kpnb1  -0.439090 -0.233911 -0.095863
Six1    0.978567  0.203659  0.807692
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading