Create a new column with the extracted middle and last strings from a column within a dataset.
Data
Status ID
Ok hello_dd
Ok hello_aa_now
No standard_cc
no standard_ee_not
Desired
Status ID type
Ok hello_dd dd
Ok hello_aa_now aa
No standard_cc cc
no standard_ee_not ee
Doing
I am able to extract the last string, but I am still researching how to extract the middle string.
df['type'] = df['ID'].str.split('_').str[-1]
Any suggestion is appreciated.
>Solution :
Assuming you want to extract the substring that follows the first _ (up to the next _, if there is one):
df['type'] = df['ID'].str.extract(r'_([^_]+)')
With split:
df['type'] = df['ID'].str.split('_').str[1]
output:
Status ID type
0 Ok hello_dd dd
1 Ok hello_aa_now aa
2 No standard_cc cc
3 no standard_ee_not ee
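For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample data in the question, and the names via_extract and via_split are only illustrative. Passing expand=False is an optional addition that makes str.extract return a Series instead of a one-column DataFrame.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the real dataset's shape).
df = pd.DataFrame({
    "Status": ["Ok", "Ok", "No", "no"],
    "ID": ["hello_dd", "hello_aa_now", "standard_cc", "standard_ee_not"],
})

# Regex approach: capture the first run of non-underscore characters
# that follows an underscore. expand=False returns a Series.
via_extract = df["ID"].str.extract(r"_([^_]+)", expand=False)

# Split approach: split on "_" and take the second piece (index 1).
via_split = df["ID"].str.split("_").str[1]

df["type"] = via_extract
print(df)
```

On this sample both approaches give the same column. If an ID contains no underscore at all, both should yield NaN for that row rather than raise an error.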