How to select all occurrences between a positive lookbehind and lookahead?

I want to learn how to capture all occurrences of a character (e.g., -) between (?<=...) and (?=...).

Suppose I have the following text:

- [abc!word1-word2-word3]
- word1-word2-word3

I aim to create a single capture group containing all - only if the string starts with [abc! and ends with ].

I tried the following pattern:

(?<=\[abc!)
  .* (-) .*
(?=\])

However, only the last occurrence of - is matched.
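The behaviour described above can be reproduced outside PCRE2. The following is a minimal sketch in Python, assuming the standard `re` module with `re.VERBOSE` standing in for the `x` option and `finditer` standing in for `g`; the sample string and variable names are illustrative only. The first `.*` is greedy, so it consumes as much as possible before backing off to a hyphen, and the whole bracketed content is consumed by a single match. The one capture group therefore records only one hyphen, the last one.

```python
import re

# The attempted pattern from the question, written in verbose mode
# (re.VERBOSE plays the role of the PCRE2 "x" option here).
ATTEMPT = re.compile(
    r"""
    (?<=\[abc!)
      .* (-) .*
    (?=\])
    """,
    re.VERBOSE,
)

text = "[abc!word1-word2-word3]"
matches = list(ATTEMPT.finditer(text))

# Index of every hyphen in the text, for comparison with what was captured.
hyphen_positions = [i for i, ch in enumerate(text) if ch == "-"]
# Index of the hyphen stored in capture group 1 of each match.
captured_positions = [m.start(1) for m in matches]

print(len(matches))        # 1: the whole bracket content is one match
print(hyphen_positions)    # [10, 16]
print(captured_positions)  # [16]: only the last hyphen is captured
```

A capture group inside a single match keeps only its most recent value, so getting every hyphen requires one match per hyphen rather than one group for all of them.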

Is there a way to achieve this? For clarity, I am using the PCRE2 flavour with the gmx options.

Solution:

You can match (or replace) each individual hyphen between [abc! and the next (closest) ] character using

(?:\G(?!\A)|\[abc!)[^][-]*\K-(?=[^][]*])

See the regex demo.

Details:

  • (?:\G(?!\A)|\[abc!) – either the end of the previous successful match (\G(?!\A); the (?!\A) lookahead is needed because \G also matches at the very start of the string) or (|) the [abc! string (\[abc!)
  • [^][-]* – zero or more chars other than [, ] and -
  • \K – a match reset operator that discards the text matched so far from the match memory buffer
  • - – a hyphen
  • (?=[^][]*]) – a positive lookahead that makes sure there are zero or more chars other than square brackets followed by a ] char immediately to the right of the current location.
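
For readers working outside PCRE2: Python's built-in `re` module supports neither \G nor \K, so the pattern above cannot be used there as written. The sketch below uses a different, two-step technique to get the same result on the sample text: first match each whole [abc!...] span, then replace the hyphens inside it with a callback. The names `BRACKETED` and `replace_hyphens`, and the `_` replacement character, are assumptions made for illustration and are not part of the original answer.

```python
import re

# One whole "[abc!...]" span whose content has no square brackets,
# mirroring the [^][] classes in the PCRE2 pattern.
BRACKETED = re.compile(r"\[abc![^\]\[]*\]")


def replace_hyphens(text: str, replacement: str = "_") -> str:
    """Replace every hyphen that sits inside a [abc!...] span."""
    # The callback receives each bracketed span and rewrites only that span,
    # so hyphens elsewhere in the text are left untouched.
    return BRACKETED.sub(
        lambda m: m.group(0).replace("-", replacement),
        text,
    )


sample = "- [abc!word1-word2-word3]\n- word1-word2-word3"
result = replace_hyphens(sample)
print(result)
# - [abc!word1_word2_word3]
# - word1-word2-word3
```

The two-step form trades the single-pattern elegance of \G and \K for portability, and it only rewrites a span when the closing ] is actually present, which matches the intent of the lookahead in the PCRE2 pattern.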