Can't find match with regex

Hi, I’m trying to find the lines that start with "CGK / WIII", but only the first line is found.
What’s wrong with my text? (It was extracted from a PDF file.)

Mytext

I’m coding in Python, using the invoice2data package to extract data from PDF invoices into a dataframe,
and I’ve hit an error with the text extracted from one PDF file.


First I tried the regex "\w{3}\s/[\s\w{4}]*" and found that it only matches 1 line. Then I also tried the fixed text "CGK / WIII", which should find 4 matches, but it does NOT.

I think there may be font differences in my text, but I’m not sure.

>Solution :

When I turn on the global flag ("Don't return after the first match") in your linked example, it shows 4 matches.
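In Python there is no global flag; the rough equivalent is to call re.findall or re.finditer instead of re.search, which stops at the first match. A minimal sketch, assuming a made-up sample string because the asker's actual PDF text is not shown here:

```python
import re

# Hypothetical sample; the real text from the PDF is not reproduced here.
sample = "CGK / WIII\nfoo\nCGK / WIII\nbar\nCGK / WIII\nCGK / WIII\n"

# re.search returns only the first match, like a tester without the global flag.
first = re.search(r"CGK / WIII", sample)

# re.findall scans the whole string and returns every non-overlapping match.
all_matches = re.findall(r"CGK / WIII", sample)

print(first.group(0))
print(len(all_matches))
```

If findall also returns a single match on the real text, the remaining lines probably differ from the pattern in some character, which is worth checking separately.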

Also, you cannot use a quantifier such as {4} inside a character set (inside []); there, the characters {, 4 and } are simply matched literally.

I’d do it like this:
\w{3}\s/\s\w{4}
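As a rough illustration in Python, the sketch below applies this pattern with a line-start anchor and re.MULTILINE (both are my additions, not part of the answer) and compares it with the original pattern. The sample strings are made up; the asker's real text may behave differently.

```python
import re

# Hypothetical invoice-like lines, not the asker's real data.
sample = "CGK / WIII 12.00\nSUB / WSSS 7.50\n"

# Suggested pattern, anchored to the start of each line via re.MULTILINE.
fixed = re.compile(r"^\w{3}\s/\s\w{4}", re.MULTILINE)
codes = fixed.findall(sample)

# Inside [], "{4}" is not a quantifier: the set matches "{", "4" or "}" literally.
brace_class = re.fullmatch(r"[\w{4}]", "{")

# The original pattern's set also contains \s, so with "*" it can run across
# newlines and swallow the start of the next line, hiding later matches.
two_lines = "CGK / WIII\nCGK / WIII\n"
greedy = re.findall(r"\w{3}\s/[\s\w{4}]*", two_lines)

print(codes)
print(brace_class is not None)
print(greedy)
```

On this two-line sample the original pattern returns one long match rather than two, which is one possible reason for seeing fewer matches than expected.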
