I’m trying to write a regex that matches any duplicate words (alphanumeric, possibly containing dashes) in some YAML, using a PCRE tool.
I have found [1] a regex that matches consecutive duplicates:
(?<=,|^)([^,]*)(,\1)+(?=,|$)
It will catch the duplicates in
hello-world,hello-world,goodbye-world,goodbye-world
but not the "hello-world"s in
hello-world,goodbye-world,goodbye-world,hello-world
Could someone help me try to build a regex pattern for the second case (or both cases)?
[1] – https://www.regular-expressions.info/duplicatelines.html
>Solution :
Put an optional ,.* between the capture group and the back-reference, so that other items may sit between the first occurrence and its repeat:
(?<=,|^)([^,]*)(?:,.*)?(,\1)(?=,|$)
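As a quick check outside a PCRE tool, here is a minimal Python sketch of the pattern above. It rests on two assumptions that are not part of the answer: Python's re module rejects the lookbehind (?<=,|^) because its alternatives have different widths, so it is rewritten here as the equivalent (?:(?<=,)|^); and the helper name first_duplicate is made up for illustration. It only reports the leftmost duplicated item on a single line, not every duplicate.

```python
import re

# The answer's pattern, with (?<=,|^) rewritten as (?:(?<=,)|^) because
# Python's re module requires lookbehind alternatives of equal width.
# In a PCRE tool the original form can be used as-is.
DUPLICATE_ITEM = re.compile(r"(?:(?<=,)|^)([^,]*)(?:,.*)?(,\1)(?=,|$)")


def first_duplicate(line):
    """Return the leftmost comma-separated item that occurs again later
    in `line`, or None if no item is repeated. (Hypothetical helper.)"""
    match = DUPLICATE_ITEM.search(line)
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = (
        "hello-world,hello-world,goodbye-world,goodbye-world",
        "hello-world,goodbye-world,goodbye-world,hello-world",
        "hello-world,goodbye-world",
    )
    for sample in samples:
        print(sample, "->", first_duplicate(sample))
```

Because .* does not cross newlines and ^ and $ are used without a multiline flag, this sketch is meant to be applied to one line of input at a time.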