Unexpected behaviour of datetime.strptime when the format string has no separators

I am trying to parse and validate a date and hour that must be in "yyyymmddhh" format. I want the function to raise an exception if the string does not conform to the specified format, so I tested two ill-formed strings that lack the hour part:

Test 1. Results as expected

>>> from datetime import datetime
>>> datetime.strptime("20230609", "%Y%m%d%H")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/miniconda3/lib/python3.10/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/home/user/miniconda3/lib/python3.10/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '20230609' does not match format '%Y%m%d%H'

Test 2. Bug?

Changing only the date, from June 9th to June 10th:

>>> datetime.strptime("20230610", "%Y%m%d%H")
datetime.datetime(2023, 6, 1, 0, 0)

As I understand it, %Y, %m, %d and %H expect zero-padded, fixed-length numbers totalling 10 characters, so the lack of separators shouldn't fool the parser. Am I mistaken?

Tested on Python 3.7 and 3.10.

>Solution:

Note 9 in the documentation indicates the leading 0 is optional with strptime:

When used with the strptime() method, the leading zero is optional for formats %d, %m, %H, %I, %M, %S, %j, %U, %W, and %V. Format %y does require a leading zero.

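strptime compiles the format into a regular expression, so the match can backtrack. It consumes 2023 with %Y and 06 with %m. %d first tries 10, but that leaves nothing for %H, so the parser falls back to the single digit 1 for %d, and the remaining 0 matches %H. The result is June 1st at hour 0.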
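The backtracking can be reproduced with a small regular expression. This is only a sketch: the pattern below is a simplified stand-in for what strptime builds internally, not the exact expression used by CPython, and the names SIMPLIFIED and split_fields are made up for this illustration.

```python
import re

# Simplified, hypothetical stand-in for the pattern strptime derives
# from "%Y%m%d%H": each field accepts an optional leading zero.
SIMPLIFIED = re.compile(
    r"(?P<Y>\d{4})"
    r"(?P<m>1[0-2]|0[1-9]|[1-9])"
    r"(?P<d>3[01]|[12]\d|0[1-9]|[1-9])"
    r"(?P<H>2[0-3]|[01]\d|\d)"
)

def split_fields(text):
    """Return the field split the pattern finds, or None if there is none."""
    match = SIMPLIFIED.fullmatch(text)
    return match.groupdict() if match else None

print(split_fields("20230610"))  # {'Y': '2023', 'm': '06', 'd': '1', 'H': '0'}
print(split_fields("20230609"))  # None
```

The first call shows the day shrinking to a single digit so that the hour can still match; the second finds no valid split at all.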

With 20230609 no such fallback exists: 0 on its own is not a valid month or day, so %m must consume 06 and %d must consume 09. %Y%m%d therefore cannot consume fewer than 8 characters, nothing is left for %H, and the parse fails.
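If the goal is to reject anything that is not exactly ten digits, one possible workaround is to check the shape of the string before handing it to strptime. This is a minimal sketch, not an official recipe; the function name parse_yyyymmddhh is hypothetical.

```python
import re
from datetime import datetime

def parse_yyyymmddhh(text):
    """Parse a strict 10-digit yyyymmddhh string, raising ValueError otherwise."""
    # Enforce the fixed width first, because strptime alone accepts
    # fields without their leading zero.
    if not re.fullmatch(r"[0-9]{10}", text):
        raise ValueError(f"expected exactly 10 digits (yyyymmddhh), got {text!r}")
    # strptime still validates the ranges (month 01-12, hour 00-23, ...).
    return datetime.strptime(text, "%Y%m%d%H")

print(parse_yyyymmddhh("2023061000"))  # 2023-06-10 00:00:00

try:
    parse_yyyymmddhh("20230610")
except ValueError as exc:
    print("rejected:", exc)
```

With the length check in place, both 20230609 and 20230610 raise ValueError, while a well-formed string such as 2023061000 still parses normally.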
