Pandas str.extract all words before a certain word

I simply want to extract all words before a certain word from a pandas df column. For example if I have a df column:

County
Salt Lake County
San Juan County
Dover County

I want to get:

Salt Lake
San Juan
Dover

I have tried:

df['new_county'] = df['County'].str.lower().str.extract(r'\w+(?=\s+county)')

But this only extracts the one word right before "County", and I couldn't figure out how to get all the words. All the other questions on SO are a lot more complicated. Please help.

Solution:

I might actually express your problem as just stripping off County from the end of the county names:

df["new_county"] = df["County"].str.replace(r'\s+County$', '', regex=True)

Note that this approach is also robust to county names that do not end in County. In those cases, the replacement leaves the existing text unchanged.
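For comparison, here is a minimal sketch that runs both approaches on the sample data from the question. The "Baltimore" row and the column names `stripped` and `extracted` are hypothetical, added only to show what happens when the suffix is missing. It assumes a reasonably recent pandas; in pandas 2.0 and later `str.replace` treats the pattern as a literal string unless `regex=True` is passed, and `str.extract` needs a capture group in the pattern.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical row without the suffix.
df = pd.DataFrame(
    {"County": ["Salt Lake County", "San Juan County", "Dover County", "Baltimore"]}
)

# Option 1: strip a trailing " County". Rows without the suffix are left as-is.
# regex=True is passed explicitly so the pattern is not treated as a literal.
df["stripped"] = df["County"].str.replace(r"\s+County$", "", regex=True)

# Option 2: capture everything before the trailing " County".
# The parentheses form the capture group that str.extract requires;
# expand=False returns a Series. Rows that do not match become NaN.
df["extracted"] = df["County"].str.extract(r"^(.*?)\s+County$", expand=False)

print(df)
```

The difference shows up on the last row: the replace version keeps "Baltimore", while the extract version yields NaN because the pattern does not match.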

