I am trying to iterate through a list of filenames in Python and remove the timestamp from each one while keeping the extension.
for item in items:
    print(item.split('_')[0])
This works, but it removes the extension as well. The strings look like dataset_2020-01-05.txt, and I need dataset.txt (or dataset_2020-01-05.zip -> dataset.zip).
I also tried this way
for item in items:
    print(item.split('_')[0] + '.' + item.split('.')[-1])
but some files don't have a timestamp, and it appends .txt to those as well, so I end up with something like dataset.txt.txt
>Solution :
To remove the timestamp, match the date pattern with the re module and strip the matched text from each item in the list. Items without a timestamp produce no match and are left unchanged.
import re
items = ["dataset_2020-01-05.txt", "dataset_2020-01-05.zip", "dataset.txt"]
for i, item in enumerate(items):
    # Look for an underscore followed by a YYYY-MM-DD date
    match = re.search(r'_\d{4}-\d{2}-\d{2}', item)
    if match:
        # Drop the matched timestamp; the extension is untouched
        items[i] = item.replace(match.group(), '')
print(items)
Output
['dataset.txt', 'dataset.zip', 'dataset.txt']
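As a variation on the same idea, here is a minimal sketch that uses re.sub and builds a new list instead of modifying items in place. The names TIMESTAMP and strip_timestamp are made up for illustration, and the pattern assumes the timestamp is always _YYYY-MM-DD sitting directly before the final extension; adjust it if your filenames differ.

```python
import re

# Assumption: the timestamp is "_YYYY-MM-DD" immediately before the final
# extension. The lookahead keeps the extension out of the match.
TIMESTAMP = re.compile(r'_\d{4}-\d{2}-\d{2}(?=\.[^.]+$)')

def strip_timestamp(name):
    """Return name with a trailing _YYYY-MM-DD (before the extension) removed."""
    return TIMESTAMP.sub('', name)

items = ["dataset_2020-01-05.txt", "dataset_2020-01-05.zip", "dataset.txt"]
cleaned = [strip_timestamp(item) for item in items]
print(cleaned)
```

Because the pattern is anchored to the extension, a date that appears elsewhere in the name is left alone, and names without a timestamp pass through unchanged.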