I have a dataframe with a column that contains addresses. I would like to split each address so that the last word goes in a column Ending and everything before it goes in a separate column Beginning. The addresses vary in length, e.g.:
- Main Street
- Jon Smith Close
- The Rovers Avenue
After searching different resources, I came up with the following:
new_address_df['begining'], new_address_df['ending'] = new_address_df['street'].str.split().str[:-1].apply(lambda x: ' '.join(map(str, x))), new_address_df['street'].str.split().str[-1]
The code works, but I am not sure if it's the right way to write this in Python. Another option would have been to convert the column to a list, modify the data in list form, and then convert it back to a dataframe. I guess that might not be the best approach.
Is there a way to improve the above code if it's not Pythonic?
>Solution :
There are certainly a lot of ways of doing this 🙂 I would go for the str accessor and rpartition. rpartition splits each string at the last occurrence of the separator into 3 components: the part before the separator, the separator itself, and the part after it. If you just take the first and last parts (columns 0 and 2), you should be done.
df[["begining", "ending"]] = df.street.str.rpartition(" ")[[0, 2]]
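To see what that does, here is a small self-contained sketch using the example addresses from the question. The "Broadway" row is my own addition (not from the question) to show what happens when a street has only one word, and I assign the two columns one at a time just to keep each step visible.

```python
import pandas as pd

# Example addresses from the question; "Broadway" is a made-up
# single-word row added to show the no-separator case.
df = pd.DataFrame(
    {"street": ["Main Street", "Jon Smith Close", "The Rovers Avenue", "Broadway"]}
)

# rpartition splits at the LAST space and returns three columns:
# 0 = text before the space, 1 = the space itself, 2 = text after it.
parts = df["street"].str.rpartition(" ")

# Keep the outer two columns; column 1 only holds the separator.
df["begining"] = parts[0]
df["ending"] = parts[2]

print(df)
```

When there is no space at all, rpartition puts the whole string in the last part and leaves the first part empty, so "Broadway" ends up entirely in ending with an empty begining.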