Pythonic way to create dataset for multilabel text classification

I have a text dataset that looks like this.

import pandas as pd
df = pd.DataFrame({'Sentence': ['Hello World',
                                'The quick brown fox jumps over the lazy dog.',
                                'Just some text to make third sentence!'
                               ],
                   'label': ['greetings',
                             'dog,fox',
                             'some_class,someother_class'
                            ]})

I want to transform this data so that each sentence is a row and each distinct label becomes its own 0/1 indicator column.

Is there a pythonic way to make this transformation for multilabel classification?

Solution:

You can split the label column on commas, use pandas.DataFrame.explode to give each label its own row, and then cross-tabulate the Sentence column against the label column with pandas.crosstab.

Try this:

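def cross_labels(df):
    # Count how often each label occurs with each sentence (0 or 1 here).
    return pd.crosstab(df["Sentence"], df["label"])

out = (
    df.assign(label=df["label"].str.split(","))  # "dog,fox" -> ["dog", "fox"]
      .explode("label")                          # one row per (sentence, label) pair
      .pipe(cross_labels)                        # sentences as rows, labels as columns
      .rename_axis(None, axis=1)                 # drop the "label" name from the column axis
      .reset_index()                             # turn Sentence back into a regular column
)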

# Output (crosstab sorts the rows by Sentence, so the order differs from the input):

print(out)

                                       Sentence  dog  fox  greetings  some_class  someother_class
0                                   Hello World    0    0          1           0                0
1        Just some text to make third sentence!    0    0          0           1                1
2  The quick brown fox jumps over the lazy dog.    1    1          0           0                0
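
If the original row order matters, or the same sentence can appear more than once, a possible alternative is pandas.Series.str.get_dummies, which splits on a separator and builds the indicator columns in one step. This is only a sketch of a different technique, not the crosstab approach above; the name out_alt is made up for illustration, and the frame is rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "Sentence": [
        "Hello World",
        "The quick brown fox jumps over the lazy dog.",
        "Just some text to make third sentence!",
    ],
    "label": ["greetings", "dog,fox", "some_class,someother_class"],
})

# Split each label string on "," and build one indicator column per distinct label.
indicators = df["label"].str.get_dummies(sep=",")

# Join on the shared index, so every input row keeps its original position.
out_alt = df[["Sentence"]].join(indicators)

print(out_alt)
```

Unlike crosstab, this keeps one output row per input row, so duplicate sentences are not merged and the rows are not re-sorted.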