split before first comma in pandas and output

I want to produce the following table in pandas. So far I only have the description column. I want to split each value on the comma and put the part before the first comma into a new commondescrip column.

description   commondescrip
00001         00001
00002         00002
00003,Area01  00003
00004         00004
00005,Area02  00005

I tried:

splitword = df2["description"].str.split(",", n=1, expand = True)
df2["commondescrip"] = splitword[0]

but it gives me NaN for the rows that contain an Area value.

How can I fix this so that commondescrip holds the text before the comma, as in the table above?

>Solution :

Don't split: splitting produces several parts that you then have to handle, while you are only interested in one. Instead, either remove the part you don't want or extract the part you do.

Removing the first comma and everything after it:

df['commondescrip'] = df['description'].str.replace(',.*', '', regex=True)

Or extracting everything before the first comma:

df['commondescrip'] = df['description'].str.extract('([^,]+)')

Output:

    description commondescrip
0         00001         00001
1         00002         00002
2  00003,Area01         00003
3         00004         00004
4  00005,Area02         00005
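
For anyone who wants to try both approaches end to end, here is a minimal self-contained sketch. The sample DataFrame is rebuilt by hand from the table in the question. The `^` anchor and `expand=False` in the extract call are small additions that are not in the answer above: the anchor pins the match to the start of the string, and `expand=False` asks for a Series instead of a one-column DataFrame.

```python
import pandas as pd

# Sample data copied from the question, kept as strings so the leading zeros survive.
df = pd.DataFrame(
    {"description": ["00001", "00002", "00003,Area01", "00004", "00005,Area02"]}
)

# Approach 1: remove the first comma and everything after it.
removed = df["description"].str.replace(",.*", "", regex=True)

# Approach 2: extract the leading run of non-comma characters.
# The ^ anchor and expand=False are additions to the original one-liner.
extracted = df["description"].str.extract(r"^([^,]+)", expand=False)

# On this data both approaches should give the same values.
assert removed.tolist() == extracted.tolist()

df["commondescrip"] = removed
print(df)
```

The two approaches differ on edge cases that the sample data does not contain. For a value that starts with a comma, the replace version yields an empty string, while the anchored extract version yields NaN.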