Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

How do I search for an attribute using BeautifulSoup?

I am trying to scrape a that contains the following HTML.


<div class="FeedCard urn:publicid:ap.org:db2b278b7e4f9fea9a2df48b8508ed14 Component-wireStory-0-2-116 card-0-2-117" data-key="feed-card-wire-story-with-image" data-tb-region-item="true">
      
<div class="FeedCard urn:publicid:ap.org:2f23aa3df0f2f6916ad458785dd52c59 Component-wireStory-0-2-116 card-0-2-117" data-key="feed-card-wire-story-with-image" data-tb-region-item="true">
      

As you can see, "FeedCard " is something they have in common. Therefore, I am trying to use a regular expression in conjunction with BeautifulSoup. Here is the code I’ve tried.

pattern = r"\AFeedCard"



for card in soup.find('div', 'class'==re.compile(pattern)):
    print(card)
    print('**********')

I’m expecting it to give me each on of the divs from above, with the asterisks separating them. Instead it is giving me the entire HTML of the page in a single instance

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Thank you,

>Solution :

No need to use regular expression here. Just use CSS selector or BS4 Api:

from bs4 import BeautifulSoup


html = """\
<div class="FeedCard urn:publicid:ap.org:db2b278b7e4f9fea9a2df48b8508ed14 Component-wireStory-0-2-116 card-0-2-117" data-key="feed-card-wire-story-with-image" data-tb-region-item="true">
Item 1
</div>

<div class="FeedCard urn:publicid:ap.org:2f23aa3df0f2f6916ad458785dd52c59 Component-wireStory-0-2-116 card-0-2-117" data-key="feed-card-wire-story-with-image" data-tb-region-item="true">
Item 2
</div>
"""

soup = BeautifulSoup(html, "html.parser")

for card in soup.select(".FeedCard"):
    print(card.text.strip())

Prints:

Item 1
Item 2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading