Extract string from one column to new column – Pandas

I have a genres column in which each value holds several genres. I need to separate the genres and add them back to the DataFrame, one genre per row. I tried the str.extract() method but didn't get anywhere.

Column example:

|title|genres|
|-----|------|
|Cowboy Bebop|['Comedy', 'Dementia', 'Horror', 'Seinen']|

Desired result, one row per genre:

|title|genres|
|-----|------|
|Cowboy Bebop|'Comedy'|
|Cowboy Bebop|'Dementia'|
|Cowboy Bebop|'Horror'|
|Cowboy Bebop|'Seinen'|

>Solution:

You need pandas.DataFrame.explode:

df = df.explode('genres').reset_index(drop=True)

Output:

>>> df
          title    genres
0  Cowboy Bebop    Comedy
1  Cowboy Bebop  Dementia
2  Cowboy Bebop    Horror
3  Cowboy Bebop    Seinen
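
For anyone who wants to reproduce this, here is a minimal sketch. The sample DataFrame is made up to mirror the question and assumes the genres column already holds real Python lists. It also shows that, on pandas 1.1 or later, passing ignore_index=True to explode should give the same fresh 0..n-1 index as chaining reset_index(drop=True).

```python
import pandas as pd

# Hypothetical sample data mirroring the question; genres holds real lists.
df = pd.DataFrame({
    "title": ["Cowboy Bebop"],
    "genres": [["Comedy", "Dementia", "Horror", "Seinen"]],
})

# Variant from the answer: explode, then rebuild the index.
via_reset = df.explode("genres").reset_index(drop=True)

# Alternative (pandas >= 1.1): let explode renumber the rows itself.
via_flag = df.explode("genres", ignore_index=True)

print(via_reset)
print(via_flag)
```

Without either step, every exploded row keeps the index label of the row it came from, so all four rows here would be labelled 0.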

Note that you might need to convert the values in the genres column to actual lists first. A value can look like a list when printed but actually be a string, for example after reading the data from a CSV file. If so, run this before the code above:

import ast
df['genres'] = df['genres'].apply(ast.literal_eval)
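
Putting both steps together, a rough end-to-end sketch might look like the following. The sample data and the to_list helper are hypothetical names introduced only for illustration. The helper parses only values that are strings, so it is safe to run even if some cells already hold lists. A typical symptom of the string case is that explode appears to do nothing, because it leaves non-list values unchanged.

```python
import ast

import pandas as pd

# Hypothetical data: each genres cell is a string that only looks like a list.
df = pd.DataFrame({
    "title": ["Cowboy Bebop"],
    "genres": ["['Comedy', 'Dementia', 'Horror', 'Seinen']"],
})


def to_list(value):
    """Parse a list-like string into a list; return other values unchanged."""
    return ast.literal_eval(value) if isinstance(value, str) else value


# Convert the strings to real lists, then put each genre on its own row.
df["genres"] = df["genres"].apply(to_list)
df = df.explode("genres").reset_index(drop=True)

print(df)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for data that comes from a file.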