How to not capture a string with regex

I have this string:

<div class"ewSvNa"><a class="ugP" href="link">Description</a><span data-testid=""><small>$</small><span>0,00</span></div>

and this regex /ewS.*?ugP\".*?f=\"(.*?)\">(.*?)<.*?<s.*?n>(.*?)</g. The result is:

Group 1 = 'link'
Group 2 = 'Description'
Group 3 = '0,00'
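
The groups above can be reproduced with a short Python sketch. It assumes Python's re module in place of the /.../g literal shown in the question; the names html and pattern are only for illustration.

```python
import re

# The string from the question, copied verbatim.
html = ('<div class"ewSvNa"><a class="ugP" href="link">Description</a>'
        '<span data-testid=""><small>$</small><span>0,00</span></div>')

# Same pattern as in the question; the escaped quotes (\") are written as plain quotes.
pattern = re.compile(r'ewS.*?ugP".*?f="(.*?)">(.*?)<.*?<s.*?n>(.*?)<')

match = pattern.search(html)
groups = match.groups()
print(groups)  # ('link', 'Description', '0,00')
```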

My question is: is it possible to get Group 3 as '$0,00'?

Thank you guys =]

Solution:

It's recommended not to use regex to parse HTML; use a proper parser such as Beautiful Soup instead.

Then your code becomes:

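from bs4 import BeautifulSoup

text = '<div class"ewSvNa"><a class="ugP" href="link">Description</a><span data-testid=""><small>$</small><span>0,00</span></div>'
# Name the parser explicitly; without it Beautiful Soup picks one and emits a warning.
soup = BeautifulSoup(text, 'html.parser')
# The span with a data-testid attribute wraps both the <small>$</small> and the amount.
amount = soup.select_one('span[data-testid]').get_text()
# '$0,00'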
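
If the regex has to stay, note that a capture group always returns one contiguous piece of the input. Here the $ and the 0,00 are separated by tags, so a single group cannot yield '$0,00' without also capturing the markup in between. A minimal sketch of one workaround, assuming Python's re module: capture the currency symbol and the amount in separate groups and join them afterwards. The pattern and the names price_pattern, symbol, amount and price are illustrative and not part of the original answer, and the pattern is tied to this exact markup.

```python
import re

# The string from the question, copied verbatim.
html = ('<div class"ewSvNa"><a class="ugP" href="link">Description</a>'
        '<span data-testid=""><small>$</small><span>0,00</span></div>')

# Two groups: the text inside <small>...</small>, then the text inside
# the <span> that follows it directly.
price_pattern = re.compile(r'<small>(.*?)</small>\s*<span>(.*?)</span>')

match = price_pattern.search(html)
symbol, amount = match.groups()

# Join the two captures to get the price as one string.
price = symbol + amount
print(price)  # $0,00
```

This stays fragile in the way the answer warns about: any change in the surrounding tags breaks the pattern, which is why the parser-based version is the safer default.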