
Regex pattern to get the number before the searched query, matching across new lines?

I have this string:

-wordpagefound: 1 offerte 201135455 fam. gaudino, umbau wohnung, winterthur seite 3 von 17
rektifikat
projekt 7514505

pos.nr menge uberschrift artikelnummer richtpreis betrag
me bild artikelbeschreibung exkl. mwst exkl. mwst

dusche - wc eltern

How can I get the number right after -wordpagefound: if I search for "wc"?
I need to get the page where it is found including new lines (for OCR purpose).

I tried preg_match_all('/(-wordpagefound).*([0-9]).*('.$searchText.')/mi', $file->text, $matches, PREG_OFFSET_CAPTURE), but it doesn’t work, apparently because of the new lines.

Thank you in advance!

Solution:

You can use

/-wordpagefound\D*(\d+).*?\bwc\b/si
/-wordpagefound\D*\K\d+(?=.*?\bwc\b)/si

See the regex demo / regex demo #2.

Details:

  • -wordpagefound – a fixed string
  • \D* – zero or more non-digits
  • (\d+) – Group 1: one or more digits
  • .*? – any zero or more chars, as few as possible (with the s flag, . also matches line breaks; the m flag only changes how ^ and $ behave)
  • \bwc\b – a whole word wc.

The second regex is a variation of the first: \K discards the text matched so far, so the overall match is just the digits, and the rest of the pattern is wrapped in a positive lookahead, which checks that wc follows without including it in the match.
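
As a rough illustration of how the first pattern might be used from PHP, here is a minimal sketch. The function name findPagesContaining and the sample text are hypothetical, not from the question. It also makes two assumptions beyond the answer: the search term is escaped with preg_quote, and the plain .*? is replaced with a tempered dot, (?:(?!-wordpagefound).)*?, so that a match cannot run past the next -wordpagefound marker when the text contains several pages.

```php
<?php
// Hypothetical helper: returns the page number of every "-wordpagefound"
// block that contains $searchText as a whole word.
function findPagesContaining(string $text, string $searchText): array
{
    // Escape regex metacharacters in the user-supplied term.
    $term = preg_quote($searchText, '/');

    // s: "." also matches line breaks; i: case-insensitive matching.
    // The tempered dot (?:(?!-wordpagefound).)*? is an addition to the
    // answer's pattern: it keeps the match inside a single page block.
    $pattern = '/-wordpagefound\D*(\d+)(?:(?!-wordpagefound).)*?\b' . $term . '\b/si';

    preg_match_all($pattern, $text, $matches);

    // Group 1 holds the page numbers as strings; convert them to integers.
    return array_map('intval', $matches[1]);
}

// Made-up sample with three pages; "wc" appears on pages 1 and 3 only.
$text = "-wordpagefound: 1 offerte 201135455 fam. gaudino\nrektifikat\n\ndusche - wc eltern\n"
      . "-wordpagefound: 2 offerte 201135455\nkueche\n"
      . "-wordpagefound: 3 offerte 201135455\nbad - wc kinder\n";

$pages = findPagesContaining($text, 'wc');
print_r($pages);
```

With the answer's original .*? and several pages in one string, a search for a word that only occurs on page 2 would report page 1, because the lazy dot is still allowed to cross into the next block. Note also that \b only behaves as a whole-word check when the search term starts and ends with a word character.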
