Having an unknown problem with Regex processing names

I’m parsing name strings that contain unusual compound forms.
The ones currently giving me trouble are names like these:

Edward St. Loe Livermore
Henry St. George Tucker III
Henry St. John

This pattern (.*)(St\.\s\w+)\s(.*) parses the first two names and completely ignores the third.

This pattern (.*)(St\.\s\w+)|(St\.\s\w+\s(.*))$ returns the third name as well, but leaves off the surname of the first two.
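The behaviour of both attempts can be reproduced outside regex101 with a short sketch. Python is an assumption here (the question does not name a language), and each name is searched as a separate string. The first pattern fails on "Henry St. John" because the \s after \w+ requires a whitespace character that is not there at the end of the string; the second pattern matches through its first alternative, which stops right after the "St. X" part.

```python
import re

names = [
    "Edward St. Loe Livermore",
    "Henry St. George Tucker III",
    "Henry St. John",
]

# The two patterns from the question, unchanged.
first = re.compile(r"(.*)(St\.\s\w+)\s(.*)")
second = re.compile(r"(.*)(St\.\s\w+)|(St\.\s\w+\s(.*))$")

def matched_text(pattern, name):
    """Return the full matched text, or None when the pattern does not match."""
    m = pattern.search(name)
    return m.group(0) if m else None

first_results = [matched_text(first, n) for n in names]
second_results = [matched_text(second, n) for n in names]

for name, a, b in zip(names, first_results, second_results):
    print(f"{name!r}: first={a!r}, second={b!r}")
```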

I’m using https://regex101.com/ to test the regex pattern.

So far I can’t figure out what pattern will return the surname in the match for all three names,
or whether I need a conditional statement in my code to parse the three-element names separately, which seems inefficient.

TIA

>Solution :

Use this regex:

(.*)(St\.\s\w+)\s*.*

The regular expression matches as follows:

Node      Explanation
(         group and capture to \1:
  .*        any character except \n (0 or more times (matching the most amount possible))
)         end of \1
(         group and capture to \2:
  St        'St'
  \.        '.'
  \s        whitespace (\n, \r, \t, \f, and " ")
  \w+       word characters (a-z, A-Z, 0-9, _) (1 or more times (matching the most amount possible))
)         end of \2
\s*       whitespace (\n, \r, \t, \f, and " ") (0 or more times (matching the most amount possible))
.*        any character except \n (0 or more times (matching the most amount possible))
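As written, the regex matches the whole name in all three cases, but the text after the "St. X" part is matched without being captured. If the surname is needed as its own group, one possible variation is to wrap the trailing .* in a third group. The sketch below assumes Python; the third group and the helper name split_name are illustrative additions, not part of the regex above.

```python
import re

# Same pattern as in the answer, except the trailing .* is wrapped in a
# third capture group (an assumption added for illustration).
NAME_RE = re.compile(r"(.*)(St\.\s\w+)\s*(.*)")

def split_name(name):
    """Return (given, compound, rest), or None if there is no 'St. X' part.

    split_name is a hypothetical helper. 'rest' is an empty string when
    nothing follows the compound part, as in "Henry St. John".
    """
    m = NAME_RE.fullmatch(name)
    if m is None:
        return None
    given, compound, rest = m.groups()
    # Group 1 keeps the space before "St.", so trim the outer pieces.
    return given.strip(), compound, rest.strip()

for name in [
    "Edward St. Loe Livermore",
    "Henry St. George Tucker III",
    "Henry St. John",
]:
    print(split_name(name))
```

Because \s* and the last group can both match nothing, one pattern covers the two-part and three-part names, so no separate conditional branch is needed for "Henry St. John".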
