remove list elements based on pattern match

I have some code that reads in many Excel files from a folder.

Sometimes, there is a file lock on one of the files that makes the locked file show up when doing the glob search. For example: "C:~$filename.xlsx"

The file doesn’t show up in the folder (even with ‘show hidden’ checked). I’ve tried to end Excel through Task Manager, but Excel isn’t running. The only way to get the ghost file to stop showing up is to reboot the machine.

So I thought I would just eliminate that item from the list if a similar locked file shows up again.

The following code is not producing four elements. It produces five.

My pattern is "\~$" for this example.

Can someone point out the error in the regex pattern?

import re

folder = [r'C:\Work\~$Counts.xlsx', '~$ad_;', 'dslkjf$dl;jf', '$lkajd~f', r'C:\Work\Counts.xlsx']

pattern = re.compile(r'\\~\$')

# get rid of any list items that contain "\~$"
filelist = [i for i in folder if not pattern.match(i)]

print(filelist)

Thanks for any help.

>Solution :

Right now you’re only finding strings that start with \~$, because match() anchors the pattern at the beginning of the string. It does not find ...\~$.... What you need is:

pattern = re.compile(r'.+?\\~\$.+')

.+? means match one or more characters, as few as possible, until \~$ is found. The trailing .+ then requires at least one more character after it.
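As an alternative to padding the pattern with wildcards, the original pattern can probably be kept as-is and applied with search() instead of match(), since search() scans the whole string. The sketch below reuses the sample list from the question (written with raw strings); the variable names with_match, with_search and with_in are made up for illustration only.

```python
import re

# Sample data from the question; raw strings keep the backslashes literal.
folder = [r'C:\Work\~$Counts.xlsx', '~$ad_;', 'dslkjf$dl;jf',
          '$lkajd~f', r'C:\Work\Counts.xlsx']

# Literal backslash, then tilde, then dollar sign.
pattern = re.compile(r'\\~\$')

# match() only tests at position 0, so no item is filtered out here.
with_match = [p for p in folder if not pattern.match(p)]

# search() looks anywhere in the string, so the lock-file path is dropped.
with_search = [p for p in folder if not pattern.search(p)]

# For a fixed substring, a plain "in" test does the same without a regex.
with_in = [p for p in folder if '\\~$' not in p]

print(len(with_match))   # five items, the behaviour described in the question
print(with_search)       # four items, lock file removed
```

If the goal is only to skip Excel lock files, the plain substring test may be the easiest to read, since nothing in the marker needs regex features.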
