Regex: stop a continuous match when it reaches a specific symbol

I want to remove every character other than letters and digits between two symbols, < and >, by replacing them with an empty string. The string is <F=*A*B*C*>

 (?<=F=|\G(?!^))[A-Za-z1-9]*\K[^A-Za-z1-9]+

 //output:<F=ABC 

 (?:^<F=(?=.+>$)|\G(?!^))[A-Za-z1-9]*\K[^A-Za-z1-9]+
 
 //output:<F=ABC 

Both patterns also match the closing > and remove it (giving <F=ABC). How can I make the match stop at a specific symbol so that the closing > is not matched?

When I add > to the negated class [^A-Za-z1-9], the pattern removes the unwanted characters and correctly leaves the > symbol alone.

(?<=F=|\G(?!^))[A-Za-z1-9]*\K[^A-Za-z1-9>]+

//output: <F=ABC> (desired result)

What is the correct way to make matching stop at this symbol? Thank you.

>Solution :

You can use

(?:\G(?!^)|<F=)[^<>]*?\K[^A-Za-z0-9<>]+(?=[^<>]*>)

See the regex demo.

Details:

  • (?:\G(?!^)|<F=) – either the end of the previous match or <F= text
  • [^<>]*? – any zero or more chars other than < and >, as few as possible
  • \K – match reset operator that discards the text matched so far from the overall match memory buffer
  • [^A-Za-z0-9<>]+ – one or more chars other than ASCII letters/digits and < and > chars
  • (?=[^<>]*>) – immediately on the right, there must be zero or more chars other than < and >, followed by a > char.
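
The pattern above relies on \G and \K, which PCRE supports but Python's standard re module does not. For comparison, here is a minimal sketch of a different, two-step technique that should give the same result for input like the one in the question: first match each complete <F=...> tag, then strip the unwanted characters from its body only. The names TAG and clean_tag_bodies are hypothetical, chosen just for this illustration.

```python
import re

# One complete tag: group 1 is the "<F=" prefix, group 2 the body
# (anything except < and >), group 3 the closing ">".
TAG = re.compile(r"(<F=)([^<>]*)(>)")

def clean_tag_bodies(text):
    """Remove non-alphanumeric ASCII chars from the body of each <F=...> tag."""
    def strip_body(match):
        # Only the body is rewritten, so the closing ">" can never be removed.
        body = re.sub(r"[^A-Za-z0-9]+", "", match.group(2))
        return match.group(1) + body + match.group(3)
    return TAG.sub(strip_body, text)

print(clean_tag_bodies("<F=*A*B*C*>"))  # <F=ABC>
```

Because the closing > is part of the outer match, a tag that is never closed is left untouched, which mirrors the effect of the (?=[^<>]*>) lookahead.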
