I have a list of files that are arranged in the following format:
'folder/sensor_01/2021/12/31/005_6_0.csv.gz',
'folder/sensor_01/2022/01/01/005_0_0.csv.gz',
'folder/sensor_01/2022/01/02/005_1_0.csv.gz',
'folder/sensor_01/2022/01/03/005_4_0.csv.gz',
....
Now, what I want to do is filter the entries that fall within a given time range. In the folder listings, the middle segment after sensor_01 and before 005 gives the time entry (down to date resolution).
I am stuck on how to extract this time segment from the folder path and convert it to a Python datetime object. I think I can then use the comparison operators to filter the entries.
>Solution :
The answer is to parse the date part of the path string into a datetime object.
Split
You can split the path on "/" to get the year, month, and day parts.
file = 'folder/sensor_01/2021/12/31/005_6_0.csv.gz'
file.split("/")
# ['folder', 'sensor_01', '2021', '12', '31', '005_6_0.csv.gz']
Here the elements at indices 2, 3, and 4 (counting from zero) are the year, month, and day.
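As a minimal sketch of how the split parts could be turned into a datetime object: the helper name path_to_date is hypothetical, and the code assumes every path follows the folder/sensor/YYYY/MM/DD/filename layout shown in the question.

```python
from datetime import datetime

def path_to_date(path):
    # Assumes the layout folder/sensor/YYYY/MM/DD/filename
    parts = path.split("/")
    # Indices 2, 3 and 4 hold the year, month and day as strings
    year, month, day = (int(p) for p in parts[2:5])
    return datetime(year, month, day)

file = 'folder/sensor_01/2021/12/31/005_6_0.csv.gz'
file_date = path_to_date(file)
```

The resulting datetime objects can be compared directly with <, <=, and so on.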
Or
strptime
See https://stackoverflow.com/a/466376/2681662. You can create a datetime object from a string with strptime. The format string may contain any literal text around the directives, so there is no restriction on the separators between the year, month, and day.
So:
from datetime import datetime

file = 'folder/sensor_01/2021/12/31/005_6_0.csv.gz'
datetime.strptime(file, 'folder/sensor_01/%Y/%m/%d/005_6_0.csv.gz') # This is valid, but the literal parts must match the path exactly
# datetime.datetime(2021, 12, 31, 0, 0)
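Because the file name differs from entry to entry, one format string containing the full file name will not match every path. A possible sketch of the filtering step from the question is to parse only the three date directories and then compare. The helper name parse_date and the start and end values are assumptions made for illustration, and the code assumes the date is always the last three directories before the file name.

```python
from datetime import datetime

files = [
    'folder/sensor_01/2021/12/31/005_6_0.csv.gz',
    'folder/sensor_01/2022/01/01/005_0_0.csv.gz',
    'folder/sensor_01/2022/01/02/005_1_0.csv.gz',
    'folder/sensor_01/2022/01/03/005_4_0.csv.gz',
]

def parse_date(path):
    # Drop the file name and keep the last three directory components
    date_part = "/".join(path.split("/")[-4:-1])
    return datetime.strptime(date_part, "%Y/%m/%d")

# Hypothetical range, both ends inclusive
start = datetime(2022, 1, 1)
end = datetime(2022, 1, 2)

# Keep only the entries whose date falls inside the range
selected = [f for f in files if start <= parse_date(f) <= end]
```

With this range, selected holds the 2022/01/01 and 2022/01/02 entries.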