Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Dataframe sum columns according to dictionnary of list of columns

I am trying to find the most pytonic way to do the following :

I have a dataframe and a dictionary

import pandas as pd
df = pd.DataFrame([[4,8,52,7,54],[0,20,2,21,35],[2,33,12,1,87]], columns = ['A', 'B', 'C', 'D', 'E'])
dic = {'x':['A','D'], 'y' : ['E'], 'z':['B','C']}

I am trying to have a dataframe with the keys of the dictionary as columns and each columns would be the sum of the columns from the original dataframe in the list of the corresponding key. For example, column ‘x’ would be the row-wise sum of the column ‘A’ and ‘D’.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

This is easily doable with some loops but I am wandering if there is any elegant solution similar to list comprehension for example.

>Solution :

Invert your dictionary, rename the columns and groupby.sum:

d2 = {v:k for k,l in dic.items() for v in l}
out = df.rename(columns=d2).groupby(level=0, axis=1).sum()

Or, for recent pandas versions (in which groupby with axis=1 will be deprecated):

d2 = {v:k for k,l in dic.items() for v in l}
out = df.T.rename(d2).groupby(level=0).sum().T

Output:

    x   y   z
0  11  54  60
1  21  35  22
2   3  87  45
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading