Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How to split a string column into two columns with a 'variable' delimiter?

I have a rough df extracted from a PDF data table that is composed of a single column containing strings. Each entry in the column is in the following form:

Mayor ............... Paul Jones
Senator ................. Billy Twister
Congress Rep. .......... Chris Rock
Chief of Staff ....... Tony Allen

I would like to use string.split(pat=".") to split these strings into two columns for Title and Name using the period strings. However, each entry in the column has different lengths for the period string delimiter. As expected, if I use pat="." or pat=".......", for instance, I end up with inconsistent splits.

My goal is to implement a way for the delimiter to be able to handle a range of period string lengths for a given minimum and maximum length. I have googled around with this question and have not been able to find a direct answer. Is this possible to implement?

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

Use Series.str.split with the regex \s+\.+\s+, which splits by 1+ spaces, 1+ periods, 1+ spaces:

df = pd.DataFrame({'A': ['Mayor ............... Paul Jones', 'Senator ................. Billy Twister', 'Congress Rep. .......... Chris Rock', 'Chief of Staff ....... Tony Allen']})

df[['Title', 'Name']] = df['A'].str.split('\s+\.+\s+', expand=True)

#                                          A           Title           Name
# 0         Mayor ............... Paul Jones           Mayor     Paul Jones
# 1  Senator ................. Billy Twister         Senator  Billy Twister
# 2      Congress Rep. .......... Chris Rock   Congress Rep.     Chris Rock
# 3        Chief of Staff ....... Tony Allen  Chief of Staff     Tony Allen
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading