Replace multiple occurrences of a character after a zero-length assertion

I would like to replace every _ with a - on lines starting with #| label: using PCRE2 regex within my text editor.

Example:

#| label: my_chunk_label
my_function_name <- function(x)

Should become:

#| label: my-chunk-label
my_function_name <- function(x)

In .NET regex, one could substitute (?<=^#\| label: .+)_ with - (regex101 example), but PCRE2 does not support unbounded (variable-length) lookbehind, so that regex is invalid there. So far, the only way I have found is to repeatedly substitute ^#[^_]+\K_ with - (regex101 example) until nothing changes, but I am curious whether there is a single-pass solution.

Solution:

If you are using PCRE, you could make use of \G and \K in the pattern below, and use - as the replacement:

(?:^#\|\h+label:\h+|\G(?!^))[^\r\n_]*\K_

The pattern matches:

  • (?: Non-capturing group for the alternatives
    • ^#\|\h+label:\h+ Match the prefix that must be at the start of the line (assuming multiline mode, so ^ matches at every line start), where \| is a literal pipe and \h matches a horizontal whitespace character
    • | Or
    • \G(?!^) Assert the current position at the end of the previous match, not at the start
  • ) Close the non-capturing group
  • [^\r\n_]* Match zero or more characters other than newlines or _
  • \K Forget what is matched so far
  • _ Match the underscore
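
If you want to check the expected result outside the editor, here is a minimal Python sketch. Note that Python's built-in re module supports neither \G nor \K, so this is not the pattern above: it swaps in a different technique, matching each whole label line and replacing the underscores in a callback. The helper name dash_labels and the use of [ \t] in place of \h are my own assumptions for illustration.

```python
import re

# Match an entire "#| label: ..." line. re.MULTILINE makes ^ and $ anchor
# at line boundaries. [ \t]+ stands in for PCRE's \h+, which re lacks.
LABEL_LINE = re.compile(r"^#\|[ \t]+label:[ \t]+.*$", re.MULTILINE)


def dash_labels(text: str) -> str:
    """Replace every _ with - on label lines only (hypothetical helper)."""
    # The callback receives each matched label line and rewrites it whole,
    # so lines that do not start with "#| label:" are never touched.
    return LABEL_LINE.sub(lambda m: m.group(0).replace("_", "-"), text)


sample = "#| label: my_chunk_label\nmy_function_name <- function(x)\n"
print(dash_labels(sample))
```

This does the whole replacement in one pass, like the \G pattern, but it relies on a replacement callback, which most text editors do not offer. That is why the \G and \K approach is the one to use in the editor itself.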

Regex demo