Splitting strings containing a newline character and outputting to two columns

A minimal example of my data looks as follows:

import pandas as pd

data = {'address': ['STREET ADDRESS', 'C/O NAME LASTNAME\nOTHER STREET ADDRESS'], 'coaddress':['', '']}

df = pd.DataFrame(data)

df

I am looking for a way (using pandas, preferably) to:

  1. identify rows in which the address column contains "C/O", and
  2. split the string at the newline character (\n), writing the part before the newline to the coaddress column of the same row and keeping the part after the newline in the address column.

The df I want to achieve looks as follows:

desired_data = {'address': ['STREET ADDRESS', 'OTHER STREET ADDRESS'], 'coaddress':['', 'C/O NAME LASTNAME']}

desired_df = pd.DataFrame(desired_data)

desired_df

Any suggestions on how to achieve this? Thanks!

Solution:

This can be done entirely in pandas using loc, str.contains and str.split:

df.loc[df["address"].str.contains('C/O'), 'coaddress'] = df["address"].str.split('\n').str[0]
df.loc[df["address"].str.contains('C/O'), 'address'] = df["address"].str.split('\n').str[1]

Output:

   address               coaddress
0  STREET ADDRESS
1  OTHER STREET ADDRESS  C/O NAME LASTNAME
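
The two assignments above each recompute the mask and the split, and a row that contains "C/O" but no newline would end up with NaN in address. Below is a minimal sketch of a more defensive variant, assuming the same address and coaddress columns as in the question. The function name split_co_address is hypothetical, introduced only for illustration. It computes the mask and the split once and leaves rows without a newline untouched.

```python
import pandas as pd


def split_co_address(df: pd.DataFrame) -> pd.DataFrame:
    """Move a leading 'C/O ...' line from 'address' into 'coaddress'.

    Hypothetical helper; assumes string columns 'address' and 'coaddress'.
    """
    out = df.copy()  # work on a copy so the caller's frame is not modified

    # Literal substring match; na=False treats missing values as non-matches.
    has_co = out["address"].str.contains("C/O", regex=False, na=False)

    # Split at the first newline only: column 0 is the text before it,
    # column 1 is everything after it (missing when there is no newline).
    parts = out["address"].str.split("\n", n=1, expand=True)
    if parts.shape[1] < 2:
        return out  # no row contains a newline, nothing to move

    # Only touch rows that contain "C/O" and actually have a second part.
    mask = has_co & parts[1].notna()
    out.loc[mask, "coaddress"] = parts.loc[mask, 0]
    out.loc[mask, "address"] = parts.loc[mask, 1]
    return out


# Usage with the data from the question.
df = pd.DataFrame({
    "address": ["STREET ADDRESS", "C/O NAME LASTNAME\nOTHER STREET ADDRESS"],
    "coaddress": ["", ""],
})
result = split_co_address(df)
print(result)
```

Passing regex=False makes the "C/O" match a plain substring test, and n=1 keeps any further newlines inside the remaining address instead of discarding them.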