Regex: finding the number after the first occurrence of a substring

I have a sentence:

"Fourth-quarter 2021 net earnings per share (EPS) of $1.26, compared with 2020 EPS of $1.01; Fourth-quarter 2021 adjusted EPS of $1.11, down 25.5 percent compared with 2020 adjusted EPS of $1.49"

and I would like to extract the number $1.11 that follows the first occurrence of the substring "adjusted EPS".

The best regex I could come up with is:

re.search("^.*Adjusted EPS.*?(\$\d+.\d+).*", text,re.IGNORECASE).group(1)

but this returns $1.49, the number after the second occurrence of "adjusted EPS".

How can I modify the search so I get the number $1.11?

Solution:

The problem is the greedy quantifier at the very beginning of your pattern:

^.*Adj ...

^ anchors the match at the start of the string. Because .* is greedy, it consumes as many characters as possible and then backtracks only as far as it must, so it stops at the last "adjusted EPS" rather than the first.

There are two fixes: either make the quantifier non-greedy (i.e. lazy) with ^.*?Adj ..., or remove ^.* entirely. re.search already scans the string for the leftmost match, so the prefix serves no purpose here.
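
To illustrate, here is a minimal sketch that runs the greedy pattern and both fixes against the sentence from the question. The variable names are made up for the example. The patterns are written as raw strings and the decimal point is escaped as \. so that it matches only a literal dot; both are small additions that the original pattern does not have.

```python
import re

text = (
    "Fourth-quarter 2021 net earnings per share (EPS) of $1.26, "
    "compared with 2020 EPS of $1.01; Fourth-quarter 2021 adjusted EPS of $1.11, "
    "down 25.5 percent compared with 2020 adjusted EPS of $1.49"
)

# Greedy prefix: ^.* runs to the end of the string and backtracks
# only until the LAST "adjusted EPS" lets the rest of the pattern match.
greedy = re.search(r"^.*adjusted EPS.*?(\$\d+\.\d+)", text, re.IGNORECASE)

# Fix 1: lazy prefix, ^.*? expands only until the FIRST "adjusted EPS".
lazy = re.search(r"^.*?adjusted EPS.*?(\$\d+\.\d+)", text, re.IGNORECASE)

# Fix 2: no prefix at all; re.search finds the leftmost match by itself.
plain = re.search(r"adjusted EPS.*?(\$\d+\.\d+)", text, re.IGNORECASE)

print(greedy.group(1))  # $1.49
print(lazy.group(1))    # $1.11
print(plain.group(1))   # $1.11
```

Note that re.search returns None when nothing matches, so in real code it is worth checking the result before calling .group(1).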
