find_all is not finding all classes

I wrote this code to collect the links of all firms, but it finds only the first two and then stops. Any idea why, and how can I change it?

import requests
from bs4 import BeautifulSoup

url = "https://www.gelbeseiten.de/branchen/rechtsanwalt/mannheim"
req = requests.get(url)
src = req.text
soup = BeautifulSoup(src, "lxml")
all_firmas = soup.find_all("article", class_="mod mod-Treffer")
for i in all_firmas:
    i_2 = i.next_element.next_element
    print(i_2.get("href"))
print("Category done!")

Solution:

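On the page you linked, only two articles have exactly the class "mod mod-Treffer". The other articles have the class "mod mod-Treffer mod-Treffer--kurz", and a class_ string that contains spaces matches only when the whole class attribute is exactly that string.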
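To see the difference without fetching the site, here is a minimal sketch on made-up markup. The HTML, the hrefs, and the variable names are assumptions for illustration only; just the class names come from the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup; the real page is not fetched here.
html = """
<article class="mod mod-Treffer"><a href="/firm-1">Firm 1</a></article>
<article class="mod mod-Treffer"><a href="/firm-2">Firm 2</a></article>
<article class="mod mod-Treffer mod-Treffer--kurz"><a href="/firm-3">Firm 3</a></article>
<article class="mod mod-Treffer mod-Treffer--kurz"><a href="/firm-4">Firm 4</a></article>
"""
soup = BeautifulSoup(html, "html.parser")

# A class_ string with spaces is compared against the whole class attribute,
# so only the articles whose class is exactly "mod mod-Treffer" match.
exact = soup.find_all("article", class_="mod mod-Treffer")

# A single class name matches every article that carries that class,
# no matter which other classes it has.
by_single_class = soup.find_all("article", class_="mod-Treffer")

print(len(exact), len(by_single_class))
```

Searching for the single class "mod-Treffer" is probably the simplest way to catch both variants, as long as no unrelated article uses that class.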

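The following code also gets the other articles by using a regex (this needs import re). The trailing .* lets the pattern match both the plain class and the longer variants.

all_firmas = soup.find_all("article", class_=re.compile("mod mod-Treffer.*"))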
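If you would rather avoid the regex, a CSS selector is one possible alternative. The sketch below runs on made-up markup and uses a hypothetical helper, firm_links, which is not part of the original code. It also looks up the link with find("a") instead of next_element.next_element, on the assumption that each article contains a link somewhere inside it.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup; the structure of the real page may differ.
html = """
<article class="mod mod-Treffer">
  <a href="/firm-1">Firm 1</a>
</article>
<article class="mod mod-Treffer mod-Treffer--kurz">
  <div><a href="/firm-2">Firm 2</a></div>
</article>
<article class="mod mod-Treffer mod-Treffer--kurz">
  <a href="/firm-3">Firm 3</a>
</article>
"""


def firm_links(soup):
    """Return the href of the first link in every article with class mod-Treffer."""
    links = []
    # "article.mod-Treffer" selects articles that carry the class,
    # whatever other classes they also have.
    for article in soup.select("article.mod-Treffer"):
        # find() searches the whole subtree, so whitespace or wrapper
        # elements before the link do not matter.
        anchor = article.find("a", href=True)
        if anchor is not None:
            links.append(anchor["href"])
    return links


soup = BeautifulSoup(html, "html.parser")
print(firm_links(soup))
```

Skipping articles without a link keeps the loop from failing on an entry that has no href, which the next_element chain cannot guarantee.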