I have a column, which represents date, however is formatted as string. I can’t use simple pandas.to_datetime like:
01/02/2023
Apr 02, 2016
Jun 2021
2023/12/01
I’ve tried to create a below formula, where I would list of potential date formats and using for loop to convert the column into desired format. However, it is obviously wrong as the column contains NONEs after applying it. Could you advise me better direction or what should I change, please?
def DateFormat(data):
for fmt in ('%b %d, %Y', '%d/%m/%Y', '%b %Y', '%Y/%b/%d'):
try:
pd.to_datetime(data['date'], format=fmt)
except ValueError:
pass
data['date'] = data.apply(DateFormat, axis = 1)
BEFORE APPLY DateFormat:
| ID | Date |
| — | ————– |
| 1 | 01/02/2023 |
| 2 | Apr 02, 2016 |
| 3 | Jun 2021 |
| 4 | 2023/12/01 |
AFTER APPLY DateFormat:
| ID | Date |
| — | —– |
| 1 | None |
| 2 | None |
| 3 | None |
| 4 | None |
>Solution :
If you don’t use return to return value then it uses return None at the end of function.
You should use return pd.to_datetime(...)
And if you want to return original value when it can’t convert it then you need return at the end. OR you could use return to return some default value.
def DateFormat(data):
for fmt in ('%b %d, %Y', '%d/%m/%Y', '%b %Y', '%Y/%b/%d'):
try:
return pd.to_datetime(data['date'], format=fmt)
except ValueError:
pass
# return original `date` if it couldn't convert
return data['date']
# or return some default value
#return datetime.datetime(1900, 1, 1)