I’m trying to search for colons in a given string so that I can split the string at each colon for preprocessing, based on the following conditions:
- Match if the colon is preceded or followed by a word, e.g. A Book: Chapter 1 or A Book :Chapter 1
- Do not match if it is part of an emoticon, e.g. :( or ): or :/ or :-) etc.
- Do not match if it is part of a given time, e.g. 16:00 etc.
I’ve come up with a regex as such
(\:)(?=\w)|(?<=\w)(\:)
which satisfies conditions 1 & 2 but still fails on condition 3, as it matches the colon in the string representation of a time. How do I fix this?
edit: it has to be in a single regex statement if possible
>Solution :
You can use
(:\b|\b:)(?!(?:(?<=\b\d:)|(?<=\b\d{2}:))\d{1,2}\b)
See the regex demo. Details:
- (:\b|\b:) – Group 1: a : that is either preceded or followed with a word char
- (?!(?:(?<=\b\d:)|(?<=\b\d{2}:))\d{1,2}\b) – there should be no one or two digits right after : (followed with a word boundary) if the : is preceded with a single or two digits (preceded with a word boundary).
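As a quick check of the pattern against the examples from the question, here is a minimal Python sketch. The helper name split_on_colon is hypothetical, and the outer group is made non-capturing here (an assumption for convenience) so that re.split does not return the colons themselves:

```python
import re

# Same pattern as above, but with a non-capturing outer group so that
# re.split() drops the matched colons instead of returning them.
COLON = re.compile(r"(?::\b|\b:)(?!(?:(?<=\b\d:)|(?<=\b\d{2}:))\d{1,2}\b)")

def split_on_colon(text):
    """Hypothetical helper: split text at every colon the pattern accepts."""
    return [part.strip() for part in COLON.split(text)]

print(split_on_colon("A Book: Chapter 1"))    # colon preceded by a word
print(split_on_colon("A Book :Chapter 1"))    # colon followed by a word
print(split_on_colon("See you at 16:00 :)"))  # time and emoticon are left alone
```

The first two calls split into the title and the chapter part, while the last string comes back as a single piece.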
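Note that :\b is equivalent to :(?=\w) and \b: is equivalent to (?<=\w):, because : is a non-word character, so a word boundary next to it requires a word character on the other side.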
If you need to get the same capturing groups as in your original pattern, replace (:\b|\b:) with (?:(:)\b|\b(:)).
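To illustrate how the two capturing groups behave in that variant, here is a small Python sketch. The function name colon_groups is hypothetical and only used for this illustration; group 1 is filled when the colon is followed by a word char, group 2 when it is only preceded by one:

```python
import re

# Variant with two capturing groups, mirroring the groups of the original pattern.
GROUPED = re.compile(r"(?:(:)\b|\b(:))(?!(?:(?<=\b\d:)|(?<=\b\d{2}:))\d{1,2}\b)")

def colon_groups(text):
    """Hypothetical helper: return (group 1, group 2) of the first match, or None."""
    match = GROUPED.search(text)
    return match.groups() if match else None

print(colon_groups("A Book :Chapter 1"))  # colon followed by a word char
print(colon_groups("A Book: Chapter 1"))  # colon only preceded by a word char
print(colon_groups("16:00"))              # time, so no match at all
```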