How to split a filename on the last occurrence of repeated delimiters? (Python)

How can I split a filename on a delimiter so that, when the delimiter is repeated, only the last occurrence in the run acts as the separator? For example:

Example File List:

abc_123
abc_123_d4
abc__123  (2 underscores)
abc_123__d4  (2 underscores)
abc____123  (4 underscores)

Expected Outcome:

abc, 123
abc, 123, d4
abc_, 123 (1 underscore)
abc, 123_, d4 (1 underscore)
abc___, 123 (3 underscores)

Using:

filename.split("_")

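would instead output empty strings wherever the underscore repeats:

['abc', '123']
['abc', '123', 'd4']
['abc', '', '123']
['abc', '123', '', 'd4']
['abc', '', '', '', '123']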

Solution:

Using re.split

import re

pattern = re.compile(r'_(?!_)')

pattern.split('abc_123')  # ['abc', '123']
pattern.split('abc_123_d4')  # ['abc', '123', 'd4']
pattern.split('abc__123')  # ['abc_', '123']
pattern.split('abc_123__d4')  # ['abc', '123_', 'd4']
pattern.split('abc____123')  # ['abc___', '123']

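The regex _(?!_) uses a negative lookahead: it matches an underscore that is not followed by another underscore. In a run of underscores, only the last one matches, so the earlier ones stay attached to the preceding part.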
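The same idea can be wrapped in a small helper that accepts any delimiter. This is only a sketch: the function name split_on_last_delimiter and its delimiter parameter are made up for illustration and are not part of the original answer. It assumes a single-character delimiter and uses re.escape so that characters with special meaning in a regex (such as "." or "-") are treated literally.

```python
import re


def split_on_last_delimiter(name, delimiter="_"):
    """Split name on delimiter, using only the last one in each repeated run.

    Hypothetical helper: assumes delimiter is a single character.
    """
    d = re.escape(delimiter)  # treat the delimiter literally inside the regex
    # Match a delimiter that is NOT immediately followed by another delimiter.
    return re.split(f"{d}(?!{d})", name)


for filename in ["abc_123", "abc_123_d4", "abc__123", "abc_123__d4", "abc____123"]:
    print(filename, "->", split_on_last_delimiter(filename))
```

Note that a filename ending in the delimiter, such as 'abc_', yields an empty final element (['abc', '']), because the trailing underscore is itself not followed by another underscore.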
