Replace combination of space, hyphen and text or a "by" using regex and pandas

I want to replace a combination of a space, a hyphen, a space and text, or the combination "By [Author]". This is my data frame:

my_titles = ['Peter Rabbit - Volume II', 'Who stole my cookie  By Cole Pattesh', 'The Stormy Night -  Nia Costas']
adf = pd.DataFrame({'my_titles':my_titles})
adf
    my_titles
0   Peter Rabbit - Volume II
1   Who stole my cookie By Cole Pattesh
2   The Stormy Night - Nia Costas

My expected df is:

    my_titles
0   Peter Rabbit
1   Who stole my cookie
2   The Stormy Night

I have tried this, expecting regex to recognize the ‘\s’ space and the ‘|’ (or):

adf['my_titles'].replace('\s-\s*|\sBy\s*$','',regex=True)
adf

I also tried this, trying to chain the space and words:

adf['my_titles'].replace('[ - \w]|[ By \w]','',regex=True)
adf

Please, do you know what I am doing wrong?

>Solution:

You can use

import pandas as pd
my_titles = ['Peter Rabbit - Volume II', 'Who stole my cookie  By Cole Pattesh', 'The Stormy Night -  Nia Costas']
adf = pd.DataFrame({'my_titles':my_titles})
adf['my_titles'] = adf['my_titles'].str.replace(r'\s+(?:-\s+|By\s+[A-Z]).*', '', regex=True)

Output of print(adf['my_titles']):

0           Peter Rabbit
1    Who stole my cookie
2       The Stormy Night
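
As for what went wrong in the question's attempts, here is a minimal sketch using the standard-library re module instead of pandas, on the assumption that pandas applies the same pattern semantics to each string when regex=True. The variable names are illustrative only. The first attempt matches just the separator, and its `\sBy\s*$` branch requires "By" to sit at the very end of the string, so the text after the separator is never consumed. The second attempt uses square brackets, which define a set of single characters rather than a sequence, so it deletes individual characters throughout the title. Separately, Series.replace returns a new Series, so the result has to be assigned back, as the answer does.

```python
import re

title_dash = 'Peter Rabbit - Volume II'
title_by = 'Who stole my cookie  By Cole Pattesh'

# First attempt from the question: only the separator itself is matched,
# and the "By" branch is anchored to the end of the string with $.
first_attempt = r'\s-\s*|\sBy\s*$'
first_dash = re.sub(first_attempt, '', title_dash)
first_by = re.sub(first_attempt, '', title_by)
print(repr(first_dash))  # separator removed, but "Volume II" is still there
print(repr(first_by))    # unchanged: "By" is not at the end of the string

# Second attempt from the question: [...] is a character class, so each
# alternative matches ONE character (a space or a word character here).
second_attempt = r'[ - \w]|[ By \w]'
second_dash = re.sub(second_attempt, '', title_dash)
print(repr(second_dash))  # nearly every character has been deleted
```

The working pattern differs in two ways: it uses a group `(?:...)` instead of character classes, and it ends with `.*` so that everything after the separator is removed too.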

Regex details:

  • \s+ – one or more whitespaces
  • (?:-\s+|By\s+[A-Z]) – a - and one or more whitespaces, or By, one or more whitespaces, and an uppercase letter
  • .* – the rest of the line.
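
The three parts listed above can be checked one title at a time. The sketch below uses the standard-library re module rather than pandas so it runs without extra dependencies. The last two titles are hypothetical additions, not part of the original data, included to show edge cases: any standalone "By" followed by a capitalized word triggers the cut, while a hyphen with no surrounding whitespace does not.

```python
import re

# Same pattern as in the answer, compiled once.
PATTERN = re.compile(r'\s+(?:-\s+|By\s+[A-Z]).*')

titles = [
    'Peter Rabbit - Volume II',
    'Who stole my cookie  By Cole Pattesh',
    'The Stormy Night -  Nia Costas',
    'Stand By Me',        # hypothetical: "By" plus a capital letter is also cut
    'Well-known Tales',   # hypothetical: hyphen without spaces around it is kept
]

# Remove everything from the matched separator to the end of each title.
cleaned = [PATTERN.sub('', title) for title in titles]

for before, after in zip(titles, cleaned):
    print(f'{before!r} -> {after!r}')
```

If real titles can legitimately contain " By " followed by a capitalized word, the pattern would need a stricter condition. Which condition is appropriate depends on the data.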