This feels like such a simple request, but I cannot figure out what is going on here, and have been messing with different RegEx testers for a while now.
My RegEx pattern: \b(?=<GTOL-[A-Z]*>)
If it matters, the command I am calling in my code (C#): Regex.Split(Text, @"\b(?=<GTOL-[A-Z]*>)").ToList();
The string that will successfully be split: <GTOL-POSI>.010<MOD-MMC>B-C<GTOL-POSI>.002<MOD-FMC>; Returns: <GTOL-POSI>.010<MOD-MMC>B-C ; <GTOL-POSI>.002<MOD-FMC>
The string that will not split as expected: <GTOL-POSI><MOD-DIAM>.004<MOD-MMC>HC<MOD-MMC><GTOL-POSI>.030<MOD-MMC>D-E<MOD-FMC>; Should return (but doesn’t): <GTOL-POSI><MOD-DIAM>.004<MOD-MMC>HC<MOD-MMC> ; <GTOL-POSI>.030<MOD-MMC>D-E<MOD-FMC>
Another string that will not split as expected: <GTOL-POSI><MOD-DIAM>.005<MOD-MMC>AD-E<MOD-MMC><GTOL-PERP>.001A; Should return (but doesn’t): <GTOL-POSI><MOD-DIAM>.005<MOD-MMC>AD-E<MOD-MMC> ; <GTOL-PERP>.001A
>Solution :
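There is no word boundary between > and <. \b only matches where a word character (letter, digit, or underscore) sits next to a non-word character, and both > and < are non-word characters. Your first string splits only because the second <GTOL-POSI> happens to be preceded by the letter C. Also, as there seems to be at least one uppercase character after GTOL-, the quantifier can be + (one or more) instead of *.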
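To make the word-boundary point concrete, here is a small sketch that counts where the original pattern can match in two of the strings from the question. The class and method names (WordBoundaryDemo, CountSplitPoints) are made up for illustration; only Regex.Matches is the real .NET API.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Hypothetical helper: counts the positions where the ORIGINAL pattern
    // (the one with \b) finds a zero-width match, i.e. where Split would cut.
    public static int CountSplitPoints(string text) =>
        Regex.Matches(text, @"\b(?=<GTOL-[A-Z]*>)").Count;
}
```

With the first string, the second <GTOL-POSI> follows the letter C, so \b matches there and the count is 1. With the second string, the second <GTOL-POSI> follows >, so \b never matches and the count is 0, which is why nothing gets split.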
If you don’t want to split at the start of the string, which would create an empty entry in the result list, you can use a negative lookbehind (?<!^\s*) to assert that the current position is not preceded by only the start of the string and optional whitespace characters.
(?<!^\s*)(?=<GTOL-[A-Z]+>)
See a regex .NET demo.
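As a rough sketch of how this could be wired into the C# call from the question (the GtolSplitter class and its Split method are hypothetical names, not part of any library; the pattern uses the + quantifier):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class GtolSplitter
{
    // No \b: split before every <GTOL-XXXX> tag, except when the tag sits at
    // the very start of the string (optionally after leading whitespace).
    private static readonly Regex SplitPoint =
        new Regex(@"(?<!^\s*)(?=<GTOL-[A-Z]+>)");

    // Hypothetical wrapper around Regex.Split, mirroring the call in the question.
    public static List<string> Split(string text) =>
        SplitPoint.Split(text).ToList();
}
```

For example, GtolSplitter.Split("<GTOL-POSI><MOD-DIAM>.005<MOD-MMC>AD-E<MOD-MMC><GTOL-PERP>.001A") should give two entries: <GTOL-POSI><MOD-DIAM>.005<MOD-MMC>AD-E<MOD-MMC> and <GTOL-PERP>.001A. Note that the variable-length lookbehind (?<!^\s*) works in .NET but is not supported by every regex engine, so results in a non-.NET tester may differ.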