Regex to extract each alphanumeric pattern

I have a string like the following:

19990101 - John DoeLorem ipsum dolor sit amet 19990102 - Elton Johnconsectetur adipiscing elit

How can I write a regex that would give me these two separate strings?

19990101 - John DoeLorem ipsum dolor sit amet

19990102 - Elton Johnconsectetur adipiscing elit

The regex I wrote only gets this far:

/\d+ -/gm


But I don't know how to include the letters that follow as well.


>Solution :

You can use either of the following:

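const text = '19990101 - John DoeLorem ipsum dolor sit amet 19990102 - Elton Johnconsectetur adipiscing elit';
// Approach 1: match each entry directly
console.log(text.match(/\d+\s+-[A-Za-z0-9\s]*[A-Za-z]/g));
// Approach 2: split on the whitespace that precedes each new entry
console.log(text.split(/(?!^)\s+(?=\d+\s+-)/));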

The text.match(/\d+\s+-[A-Za-z0-9\s]*[A-Za-z]/g) approach extracts each substring that starts with the \d+\s+- prefix, continues through alphanumeric or whitespace chars, and ends on a letter. Details:

  • \d+ – one or more digits
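  • \s+ – one or more whitespace chars
  • - – a hyphen
  • [A-Za-z0-9\s]* – zero or more alphanumeric or whitespace chars
  • [A-Za-z] – a letter; because the match must end on a letter, the preceding greedy class backtracks out of the next entry's leading digits and whitespace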
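
As a minimal sketch of the match approach, the pattern can be wrapped in a small function. The name extractEntries is hypothetical and not part of the answer above; the sample string is the one from the question.

```javascript
// extractEntries is a hypothetical helper name used only for illustration.
function extractEntries(text) {
  const pattern = /\d+\s+-[A-Za-z0-9\s]*[A-Za-z]/g;
  // String.prototype.match returns null when nothing matches, so fall back to [].
  return text.match(pattern) || [];
}

const sample = '19990101 - John DoeLorem ipsum dolor sit amet 19990102 - Elton Johnconsectetur adipiscing elit';
const entries = extractEntries(sample);

// Print one entry per line.
for (const entry of entries) {
  console.log(entry);
}
```

Note that the character class only allows letters, digits, and whitespace, so an entry is cut short at the first punctuation character it contains.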

The text.split(/(?!^)\s+(?=\d+\s+-)/) approach splits the string at each run of one or more whitespace chars that is followed by one or more digits + one or more whitespace chars + -:

  • (?!^) – not at the start of string
  • \s+ – one or more whitespace chars
  • (?=\d+\s+-) – a positive lookahead that matches a location that is immediately followed by one or more digits + one or more whitespace chars + -.
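
One practical difference between the two approaches shows up when the entries contain punctuation. The sketch below uses a made-up sample string (an assumption, not from the question) and a hypothetical helper named splitEntries. The split pattern does not restrict what an entry may contain, whereas the match pattern stops at the first char outside [A-Za-z0-9\s].

```javascript
// splitEntries is a hypothetical helper name, not part of the original answer.
function splitEntries(text) {
  // Split on whitespace (not at the very start of the string) that is
  // followed by "digits, whitespace, hyphen", i.e. the start of a new entry.
  return text.split(/(?!^)\s+(?=\d+\s+-)/);
}

// Assumed sample input with punctuation, invented for illustration.
const punctuated = '19990101 - John Doe: Lorem, ipsum. 19990102 - Elton John: dolor!';

const viaSplit = splitEntries(punctuated);
const viaMatch = punctuated.match(/\d+\s+-[A-Za-z0-9\s]*[A-Za-z]/g) || [];

console.log(viaSplit); // entries keep their punctuation
console.log(viaMatch); // entries are truncated before the first ':'
```

If the real data can contain anything other than letters, digits, and whitespace, the split approach is probably the safer choice.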
