Getting binary labels from a dataframe and a list of labels

Suppose I have the following list of labels,

labs = ['G1','G2','G3','G4','G5','G6','G7']

and also suppose that I have the following df:

   group entity_label
0      0           G1
1      0           G2
3      1           G5
4      1           G1
5      2           G1
6      2           G2
7      2           G3

To produce the above df you can use:

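import pandas as pd

df_test = pd.DataFrame({'group': [0,0,0,1,1,2,2,2,2],
                        'entity_label':['G1','G2','G2','G5','G1','G1','G2','G3','G3']})

# drop repeated (group, entity_label) pairs; drop_duplicates is not in-place, so assign the result
df_test = df_test.drop_duplicates(subset=['group','entity_label'], keep='first')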

For each group, I want to look up its labels in labs and build a new dataframe that holds one binary list per group: position i is 1 if labs[i] occurs in that group and 0 otherwise.

   group    entity_label_binary
0      0  [1, 1, 0, 0, 0, 0, 0]
1      1  [1, 0, 0, 0, 1, 0, 0]
2      2  [1, 1, 1, 0, 0, 0, 0]

For example, group 0 contains G1 and G2, so its list has 1s in the first two positions and 0s everywhere else, and so on for the other groups. How can I do this?

Solution:

One option, based on crosstab:

labs = ['G1','G2','G3','G4','G5','G6','G7']

(pd.crosstab(df_test['group'], df_test['entity_label'])
   .clip(upper=1)
   .reindex(columns=labs, fill_value=0)
   .agg(list, axis=1)
   .reset_index(name='entity_label_binary')
)
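
To see what each step of the crosstab chain contributes, it can be unpacked into intermediate frames. This is only an illustrative sketch: the names counts and indicators are introduced here and are not part of the answer, and the sample frame keeps its duplicate rows on purpose so that the effect of clip is visible.

```python
import pandas as pd

labs = ['G1', 'G2', 'G3', 'G4', 'G5', 'G6', 'G7']
df_test = pd.DataFrame({
    'group': [0, 0, 0, 1, 1, 2, 2, 2, 2],
    'entity_label': ['G1', 'G2', 'G2', 'G5', 'G1', 'G1', 'G2', 'G3', 'G3'],
})

# Step 1: count how often each label occurs in each group.
# Only labels that actually appear (G1, G2, G3, G5) become columns.
counts = pd.crosstab(df_test['group'], df_test['entity_label'])

# Step 2: cap every count at 1, so a label repeated within a group
# (G2 appears twice in group 0) still yields a plain 0/1 indicator.
indicators = counts.clip(upper=1)

# Step 3: put the columns in the order of `labs` and add the labels
# that never occur (G4, G6, G7) as all-zero columns.
indicators = indicators.reindex(columns=labs, fill_value=0)

print(indicators)
```

The final two calls in the chain then only reshape this frame: each row is collapsed into a list, and the group index is turned back into a regular column.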

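Variant, with get_dummies and groupby.max (dtype=int keeps the indicators as 0/1; since pandas 2.0, get_dummies returns booleans by default, which would otherwise produce lists mixing True/False with the 0 fill value):

(pd.get_dummies(df_test['entity_label'], dtype=int)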
   .groupby(df_test['group']).max()
   .reindex(columns=labs, fill_value=0)
   .agg(list, axis=1)
   .reset_index(name='entity_label_binary')
)

Output:

   group    entity_label_binary
0      0  [1, 1, 0, 0, 0, 0, 0]
1      1  [1, 0, 0, 0, 1, 0, 0]
2      2  [1, 1, 1, 0, 0, 0, 0]
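
If the literal "look each label up" reading of the question is preferred, a third possibility is to collect the labels of each group into a set and test membership for every entry of labs. This is a sketch of an alternative, not part of the answer above; the names labels_per_group and result are made up for the example.

```python
import pandas as pd

labs = ['G1', 'G2', 'G3', 'G4', 'G5', 'G6', 'G7']
df_test = pd.DataFrame({
    'group': [0, 0, 0, 1, 1, 2, 2, 2, 2],
    'entity_label': ['G1', 'G2', 'G2', 'G5', 'G1', 'G1', 'G2', 'G3', 'G3'],
})

# Collect the distinct labels seen in each group (a set ignores duplicates).
labels_per_group = df_test.groupby('group')['entity_label'].agg(set)

# For every group, walk through `labs` in order and emit 1 if the label
# is in that group's set, 0 otherwise.
result = (
    labels_per_group
    .map(lambda seen: [int(lab in seen) for lab in labs])
    .reset_index(name='entity_label_binary')
)

print(result)
```

As with the reindex-based versions, a label that occurs in the dataframe but is missing from labs is silently ignored here, so it may be worth checking for such labels first if that can happen in the real data.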