Extract multiple patterns from a string in pandas

I want to create a new column holding a substring extracted from an existing column: the middle part when the ID has three underscore-separated parts, or the last part when it has two.

Data

Status             ID
Ok                 hello_dd           
Ok                 hello_aa_now       
No                 standard_cc        
no                 standard_ee_not  

Desired

Status             ID                        type
Ok                 hello_dd                  dd     
Ok                 hello_aa_now              aa
No                 standard_cc               cc
no                 standard_ee_not           ee

Doing

I am able to extract the last string, but I am still working out how to extract the middle string.

df['type'] = df['ID'].str.split('_').str[-1]

Any suggestions are appreciated.

Solution:

Assuming you want to extract the string after the first _:

df['type'] = df['ID'].str.extract(r'_([^_]+)')

With split:

df['type'] = df['ID'].str.split('_').str[1]

output:

  Status               ID type
0     Ok         hello_dd   dd
1     Ok     hello_aa_now   aa
2     No      standard_cc   cc
3     no  standard_ee_not   ee
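
To try both approaches end to end, here is a minimal, self-contained sketch. It rebuilds the sample data from the question; the variable names `by_regex` and `by_split` are only illustrative, and `expand=False` is an addition here so that `str.extract` returns a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Status": ["Ok", "Ok", "No", "no"],
    "ID": ["hello_dd", "hello_aa_now", "standard_cc", "standard_ee_not"],
})

# Regex approach: capture the first run of non-underscore characters
# that follows an underscore. expand=False yields a Series.
by_regex = df["ID"].str.extract(r"_([^_]+)", expand=False)

# Split approach: break on every underscore and take the second piece.
by_split = df["ID"].str.split("_").str[1]

# Both approaches should agree on this data, so either can fill the column.
df["type"] = by_regex
print(df)
```

For an ID that contains no underscore at all, both approaches should yield NaN rather than raise an error, so it may be worth checking for missing values afterwards if such rows can occur.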