Python Selenium – how to get all URLs on a page that only loads each link after clicking on the div?

I’m trying to scrape the results from this page https://www.zapimoveis.com.br/aluguel/apartamentos/sp+sao-paulo+zona-sul+itaim-bibi/ using Selenium, but I got stuck on obtaining the URL of each result. It seems that each card’s URL is not stored in an <a> element, and apparently it is not stored anywhere in the inner HTML of each div.

The only way I have found to obtain the address is to click on the div, which opens a new tab.
Currently, I’m using Selenium to click on each card, copy the address, and then close the tab. This is not only much more complex and time-consuming, but it could also trigger a CAPTCHA because of how many requests it makes to the website.

Is there a way to obtain the URLs of all offers on this page without this clicking process? I tried using the inspect tool in Chrome but couldn’t figure out which JavaScript (or whatever else) is responsible for this behavior.

Thanks!

Solution:

I checked out the site and it looks like each card-container has a data-id that can be used to access the listing.
The link for this card:

<div data-id="2593637292" class="card-container js-listing-card">{THE HTML FOR THAT CARD}</div>

would be https://www.zapimoveis.com.br/imovel/2593637292.
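
The idea above can be sketched in Python without any clicking. This is a minimal sketch, not a tested scraper for the live site: it assumes the cards are already present in the rendered page source, that they still carry the `card-container` class and a `data-id` attribute, and that the `/imovel/<id>` URL pattern holds for every listing. The names `CardIdParser` and `listing_urls` are made up for illustration.

```python
from html.parser import HTMLParser

# URL pattern taken from the answer above; assumed to hold for every listing.
BASE_URL = "https://www.zapimoveis.com.br/imovel/"


class CardIdParser(HTMLParser):
    """Collects data-id values from elements whose class list contains 'card-container'."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        data_id = attributes.get("data-id")
        if "card-container" in classes and data_id:
            self.ids.append(data_id)


def listing_urls(page_source):
    """Return one listing URL per card found in the given HTML string."""
    parser = CardIdParser()
    parser.feed(page_source)
    return [BASE_URL + data_id for data_id in parser.ids]


if __name__ == "__main__":
    sample = '<div data-id="2593637292" class="card-container js-listing-card">...</div>'
    for url in listing_urls(sample):
        print(url)
```

With Selenium, the rendered HTML could be passed in as `listing_urls(driver.page_source)` once the results have loaded. Alternatively, the same ids could be read directly with `driver.find_elements(By.CSS_SELECTOR, "div.card-container[data-id]")` and `element.get_attribute("data-id")`. Either way, each card is read once and no extra tabs are opened.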
