Create a new df column which includes a list

I’m working on a multi-label image classification task. I have a dataframe with two columns (id and labels). I want to create a new column that checks the ids for duplicates and, if an id appears more than once (which is the case here), collects the additional labels. The result should be a new column containing all labels for that id. I’m struggling to write the labels into a new column as a list. Can anyone help me here?

My df has the following structure:

| id       | labels         |
| -------- | -------------- |
| x.jpg    | label_1        |
| x.jpg    | label_2        |

New dataframe

| id       | labels         | all_labels                                |
| -------- | -------------- | ----------------------------------------- |
| x.jpg    | label_1        | [label_1, label_2, and others if present] |
| x.jpg    | label_2        |                                           |

>Solution :

I think this does what you want, although the format is a bit different (one row per id instead of a new column on the original rows):

newdf = df.groupby('id')['labels'].agg(list).reset_index(name='labels')

produces

      id              labels
0  x.jpg  [label_1, label_2]
1  y.jpg           [label_3]
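
If the new column should sit next to the original rows, as in the table from the question, one possible approach is to build the same per-id lists and map them back onto the `id` column. This is only a sketch: the sample data is assumed (the `y.jpg` / `label_3` row is taken from the output above), and the name `labels_per_id` is made up for illustration.

```python
import pandas as pd

# Assumed sample data mirroring the question and the output above.
df = pd.DataFrame({
    "id": ["x.jpg", "x.jpg", "y.jpg"],
    "labels": ["label_1", "label_2", "label_3"],
})

# One list of labels per id, as a Series indexed by id.
labels_per_id = df.groupby("id")["labels"].agg(list)

# Look up each row's id in that Series, so every row keeps its own
# label and also carries the full list of labels for its id.
df["all_labels"] = df["id"].map(labels_per_id)

print(df)
```

Unlike the table in the question, this fills `all_labels` on every row of a duplicated id, not only the first one. If only one row per id is needed, the `groupby(...).agg(list)` result above is already enough.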