I have a dataframe with a Date column consisting of strings in this format. I need to strip the end of each string so that I can convert it to a datetime object.
"20231101 05:00:00 America/New_York"
"20231101 06:00:00 America/New_York"
I have tried these approaches unsuccessfully.
df['Date'] = df['Date'].replace('^.*\]\s*', '', regex=True)
df['Date'] = df['Date'].str.strip(' America/New_York')
df['Date'] = df['Date'].map(lambda x: x.rstrip(' America/NewYork'))
as well as a couple of others based on my searches. Is there an easy way to do this, or should I write a function that slices the string by grabbing the first 17 characters and assigns the result back to the df? Note the string could also be of the form '20231101 05:00:00 America/Central'.
Thanks for any and all assistance.
>Solution :
- If you want to remove a particular suffix, then I recommend str.removesuffix rather than str.strip. str.strip treats its argument as a set of characters to remove from both ends, not as a literal suffix.
- Notice that you sometimes write New_York with an underscore and sometimes NewYork without an underscore. If you ask to remove 'NewYork' then 'New_York' won't be removed.
- After the edit in your question, the suffixes all start with ' America' but differ afterwards; in this case you could use str.split(' America').str[0] to keep everything before ' America'.
import pandas as pd
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie'],
'Date': ["20231101 05:00:00 America/New_York",
"20231101 06:00:00 America/New_York",
"20231101 07:00:00 America/Central"]
})
# df['Date'] = df['Date'].str.removesuffix(' America/New_York')
df['Date'] = df['Date'].str.split(' America').str[0]
print(df)
#       Name               Date
# 0    Alice  20231101 05:00:00
# 1      Bob  20231101 06:00:00
# 2  Charlie  20231101 07:00:00
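Since the stated goal is a datetime column, here is a minimal sketch of the remaining step. It assumes every value starts with a 'YYYYMMDD HH:MM:SS' timestamp, as in the samples from the question; the variable name stamps is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": [
        "20231101 05:00:00 America/New_York",
        "20231101 06:00:00 America/New_York",
        "20231101 07:00:00 America/Central",
    ]
})

# Keep everything before ' America', as in the answer above.
stamps = df["Date"].str.split(" America").str[0]

# Parse the remaining text with an explicit format (assumed from the samples).
# The result is timezone-naive because the zone name was discarded.
df["Date"] = pd.to_datetime(stamps, format="%Y%m%d %H:%M:%S")
```

Because the timestamp part is always 17 characters in these samples, df['Date'].str[:17] would give the same strings as the split. Either way the zone information is dropped, so if the zones matter they would need to be handled separately.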