Python remove digits in the middle of the string

I am trying to iterate through a list of file names in Python and remove the timestamp from each one while keeping the extension.

for item in items:
    print(item.split('_')[0])

This works, but it also deletes the extension. The string looks like dataset_2020-01-05.txt and I need it to become dataset.txt (likewise dataset_2020-01-05.zip -> dataset.zip).

I also tried this way

for item in items:
    print(item.split('_')[0] + item.split('.')[-1])

But some files don't have a timestamp, and this appends the extension to those files as well, so I ended up with something like dataset.txt.txt

>Solution :

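To remove the timestamp, match the date pattern with the re module and strip the matched text from each entry in the items list. Names without a timestamp produce no match and are left unchanged.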

import re

items = ["dataset_2020-01-05.txt", "dataset_2020-01-05.zip", "dataset.txt"]
for i, item in enumerate(items):
    # Look for an underscore followed by a YYYY-MM-DD date
    match = re.search(r'_\d{4}-\d{2}-\d{2}', item)
    if match:
        # Drop the matched timestamp; the extension is untouched
        items[i] = item.replace(match.group(), '')
print(items)

Output

['dataset.txt', 'dataset.zip', 'dataset.txt']
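
The search-then-replace step above can also be collapsed into a single substitution. The following is only a sketch of that variant, not part of the original answer: the helper name strip_timestamp and the constant DATE_SUFFIX are made up for illustration, and it assumes every timestamp has the form _YYYY-MM-DD.

```python
import re

# Assumption: timestamps always look like _YYYY-MM-DD (underscore plus ISO date).
DATE_SUFFIX = re.compile(r'_\d{4}-\d{2}-\d{2}')

def strip_timestamp(name):
    """Hypothetical helper: return name with any _YYYY-MM-DD part removed."""
    # sub() returns the string unchanged when the pattern does not match,
    # so names without a timestamp need no special handling.
    return DATE_SUFFIX.sub('', name)

items = ["dataset_2020-01-05.txt", "dataset_2020-01-05.zip", "dataset.txt"]
cleaned = [strip_timestamp(item) for item in items]
print(cleaned)
```

This builds a new list instead of modifying items in place, which may or may not suit the calling code. Note that sub() removes every occurrence of the pattern in a name, not just the first.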