I have a string like the following:
19990101 - John DoeLorem ipsum dolor sit amet 19990102 - Elton Johnconsectetur adipiscing elit
How can I write a regex that would give me these two separate strings?
19990101 - John DoeLorem ipsum dolor sit amet
19990102 - Elton Johnconsectetur adipiscing elit
The regex I wrote only matches the date and the hyphen:
/\d+ -/gm
But I don't know how to include the letters that follow as well.
>Solution :
You can use either of the following:
const text = '19990101 - John DoeLorem ipsum dolor sit amet 19990102 - Elton Johnconsectetur adipiscing elit';
console.log(text.match(/\d+\s+-[A-Za-z0-9\s]*[A-Za-z]/g))
console.log(text.split(/(?!^)\s+(?=\d+\s+-)/))
The text.match(/\d+\s+-[A-Za-z0-9\s]*[A-Za-z]/g) approach extracts the alphanumeric and whitespace chars that follow the \d+\s+- pattern, up to the last letter of each run. Details:
\d+ – one or more digits
\s+ – one or more whitespaces
- – a hyphen
[A-Za-z0-9\s]* – zero or more alphanumeric or whitespace chars
[A-Za-z] – a letter
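As a minimal sketch of the matching approach, the regex can be wrapped in a small helper. The name extractEntries is hypothetical and only used for illustration; the fallback to an empty array is an added assumption, since match returns null when nothing matches.

```javascript
// Sample input taken from the question.
const sample = '19990101 - John DoeLorem ipsum dolor sit amet 19990102 - Elton Johnconsectetur adipiscing elit';

// extractEntries is a hypothetical helper name, not part of any library.
function extractEntries(input) {
  // With the g flag, match returns every match, or null if there is none,
  // so fall back to an empty array in that case.
  return input.match(/\d+\s+-[A-Za-z0-9\s]*[A-Za-z]/g) ?? [];
}

const entries = extractEntries(sample);
console.log(entries);
// [ '19990101 - John DoeLorem ipsum dolor sit amet',
//   '19990102 - Elton Johnconsectetur adipiscing elit' ]
```

The final [A-Za-z] forces each match to end on a letter, so the greedy character class gives back the trailing space and the digits of the next date.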
The text.split(/(?!^)\s+(?=\d+\s+-)/) approach splits the string at any run of one or more whitespaces that is followed by one or more digits + one or more whitespaces + -:
(?!^) – not at the start of string
\s+ – one or more whitespaces
(?=\d+\s+-) – a positive lookahead that matches a location that is immediately followed with one or more digits + one or more whitespaces + -.
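A short sketch of the splitting approach, assuming the input may also contain punctuation (the punctuated string below is made up for illustration, and splitEntries is a hypothetical helper name). It also shows one way the two approaches can differ: the split keeps everything between two dates, while the match stops at the first character outside its character class.

```javascript
// splitEntries is a hypothetical helper name, not part of any library.
function splitEntries(input) {
  // Split on whitespace that is directly followed by "digits, whitespace, hyphen".
  // The lookahead consumes nothing, so the date stays in the next chunk.
  return input.split(/(?!^)\s+(?=\d+\s+-)/);
}

// Made-up input with punctuation, used only to compare the two approaches.
const punctuated = '19990101 - Hello, world! 19990102 - Bye.';

const bySplit = splitEntries(punctuated);
const byMatch = punctuated.match(/\d+\s+-[A-Za-z0-9\s]*[A-Za-z]/g);

console.log(bySplit); // [ '19990101 - Hello, world!', '19990102 - Bye.' ]
console.log(byMatch); // [ '19990101 - Hello', '19990102 - Bye' ]
```

If the text after each date can contain characters other than letters, digits, and whitespace, the split version is probably the safer choice.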