Matching an entire sentence containing words even if the sentence spans multiple lines

I am trying to match every complete sentence in a document that contains one of a set of words, even when the sentence spans multiple lines.

My current attempt only captures the sentence if it fits on a single line:

^.*\b(dog|cat|bird)\b.*\.

I am using the ECMAScript (JavaScript) regex flavor.

Solution:

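In ECMAScript, . does not match line terminators unless the s flag is set, so the original pattern cannot cross a line break. A negated character class such as [^?!.] does match newlines. If the input is not expected to contain abbreviations or other periods inside a sentence (for example "Dr." or "4.5"), use: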

[^?!.\s][^?!.]*?\b(dog|cat|bird)\b[^?!.]*[.?!]
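As a minimal sketch of how the pattern might be applied in JavaScript, the snippet below adds the g flag (an assumption, since the answer does not specify flags) so that every matching sentence is returned. The sample text and the findSentences helper are hypothetical names introduced only for illustration.

```javascript
// Pattern from the answer; the g flag is added here so all sentences are found.
const SENTENCE_WITH_ANIMAL = /[^?!.\s][^?!.]*?\b(dog|cat|bird)\b[^?!.]*[.?!]/g;

// Hypothetical sample input: the second sentence wraps onto a new line.
const text = [
  "The weather was fine.",
  "A small dog ran across",
  "the road and barked!",
  "Nobody noticed. Did the cat care?",
].join("\n");

// Illustrative helper: returns each matched sentence with the keyword captured in group 1.
function findSentences(input) {
  return Array.from(input.matchAll(SENTENCE_WITH_ANIMAL), (m) => ({
    sentence: m[0],
    keyword: m[1],
  }));
}

const results = findSentences(text);
for (const { sentence, keyword } of results) {
  console.log(`${keyword}: ${JSON.stringify(sentence)}`);
}
// dog: "A small dog ran across\nthe road and barked!"
// cat: "Did the cat care?"
```

The matched sentence keeps its embedded newline. If a single-line result is preferred, one option is to normalize whitespace afterwards, for example with sentence.replace(/\s+/g, " ").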

EXPLANATION

--------------------------------------------------------------------------------
  [^?!.\s]                 any character except: '?', '!', '.',
                           whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
  [^?!.]*?                 any character except: '?', '!', '.' (0 or
                           more times (matching the least amount
                           possible))
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    dog                      'dog'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    cat                      'cat'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    bird                     'bird'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  [^?!.]*                  any character except: '?', '!', '.' (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  [.?!]                    any character of: '.', '?', '!'
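
The breakdown above implies two behaviours that are easy to check: \b stops a keyword from matching inside a longer word, and any period inside a sentence ends the match early, which is why the answer assumes no abbreviations. The following is a small sketch of both; the matchSentences helper and the sample strings are hypothetical, and the g flag is again an added assumption.

```javascript
// Same pattern as in the answer, with the g flag added to collect all matches.
const pattern = /[^?!.\s][^?!.]*?\b(dog|cat|bird)\b[^?!.]*[.?!]/g;

// Illustrative helper: returns an array of matched sentences (empty if none).
const matchSentences = (input) => input.match(pattern) || [];

// \b keeps "cat" from matching inside "catalog", so nothing is returned.
console.log(matchSentences("She browsed the catalog."));
// []

// The period in "Dr." is treated as a sentence end, so the prefix is lost.
console.log(matchSentences("Dr. Smith owns a dog. It barks."));
// [ 'Smith owns a dog.' ]

// A decimal point has the same effect and truncates the match.
console.log(matchSentences("The dog weighs 4.5 kg."));
// [ 'The dog weighs 4.' ]
```

Handling abbreviations or decimals reliably would need either a more elaborate pattern or a dedicated sentence splitter, which is outside what the answer covers.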