Conditional operators in Beautiful Soup findAll by attribute value

I want to find all td elements whose custom HTML attribute data-stat is not equal to a given value, i.e. everything except data-stat="randomValue".
My data looks something like this:

<td data-stat="foo">10</td>
<td data-stat="bar">20</td>
<td data-stat="test">30</td>
<td data-stat="DUMMY"> </td>

I know that I can just select for foo, bar, and test, but my actual dataset will have hundreds of different values for data-stat, so listing them all wouldn't be feasible.

Is there something like a != operator that I can use in Beautiful Soup? I tried:

[td.getText() for td in rows[i].findAll('td:not([data-stat="DUMMY"])')]

but I only get [] as a value.

Solution:

You can use a list comprehension to filter out the unwanted tags, for example:

print([td.text for td in soup.find_all("td") if td.get("data-stat") != "DUMMY"])

Or use a CSS selector with .select (as @Barmar said in the comments, .find_all doesn't accept CSS selectors):

print([td.text for td in soup.select('td:not([data-stat="DUMMY"])')])
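
To see both approaches side by side, here is a minimal self-contained sketch built on the sample rows from the question. The surrounding table markup, the html.parser choice, and the variable names are assumptions made for illustration; they are not part of the original post. It also shows why the original attempt returned []: find_all treats the string as a tag name rather than as a selector.

```python
from bs4 import BeautifulSoup

# Sample cells from the question, wrapped in a table row (the wrapper is an assumption).
html = """
<table><tr>
<td data-stat="foo">10</td>
<td data-stat="bar">20</td>
<td data-stat="test">30</td>
<td data-stat="DUMMY"> </td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all interprets a string argument as a tag name, so no element matches this.
as_tag_name = soup.find_all('td:not([data-stat="DUMMY"])')

# Approach 1: fetch every td, then filter on the attribute value in Python.
by_comprehension = [
    td.text for td in soup.find_all("td") if td.get("data-stat") != "DUMMY"
]

# Approach 2: let the CSS selector engine behind .select do the exclusion.
by_selector = [td.text for td in soup.select('td:not([data-stat="DUMMY"])')]

print(as_tag_name)
print(by_comprehension)
print(by_selector)
```

Both approaches also keep td elements that have no data-stat attribute at all, because td.get returns None for a missing attribute and :not() only excludes elements that match the inner selector.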