How do I get the HTML page of a site with validation? (requests)

The site shows a check like this:
[Image: Check]

How can I get data from such a site?

Solution:

As I already wrote in the comments, try the selenium library. It drives a real browser, so the site sees a normal browser session instead of a bare HTTP request.

Before starting, install selenium and webdriver-manager (which downloads the matching browser driver for you):
pip install -U selenium webdriver-manager

Here is example code for Chrome. It should work for sites that only need a real browser to pass the check:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager  # downloads a matching ChromeDriver (other browsers have their own managers)
import time

URL = 'YOUR LINK'

# Selenium drives a real browser, which sends its own headers, so no custom headers dict is needed
options = webdriver.ChromeOptions()
# Selenium 4 takes the driver path through a Service object instead of executable_path
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
try:
    driver.get(URL)
    time.sleep(6)  # give the site 6 seconds to pass its check and finish loading

    html = driver.page_source
    print(html)  # outputs the HTML

    # save the HTML to a file
    with open('saving.html', 'w', encoding='utf-8') as f:
        f.write(html)
finally:
    driver.quit()  # close the browser and stop the driver process
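
The fixed time.sleep(6) in the example is a guess: it may be too short on a slow connection and wasteful on a fast one. A possible alternative is to poll the page source until the check page is gone. The sketch below is only an illustration and assumes the check page contains some recognizable text; the helper name wait_for_html and the marker string are hypothetical and not part of Selenium.

```python
import time
from typing import Callable


def wait_for_html(get_html: Callable[[], str],
                  check_marker: str,
                  timeout: float = 30.0,
                  poll_interval: float = 0.5) -> str:
    """Poll get_html() until check_marker no longer appears in the page.

    get_html is any zero-argument callable returning the current HTML,
    for example `lambda: driver.page_source`.
    Raises TimeoutError if the marker is still present after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = get_html()
        if check_marker not in html:
            return html  # the check page is gone, this is the real page
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"page still contains {check_marker!r} after {timeout} s"
            )
        time.sleep(poll_interval)
```

With the driver from the example, this could replace the sleep and the page_source line: html = wait_for_html(lambda: driver.page_source, "TEXT FROM THE CHECK PAGE"). The marker has to be taken from the actual check page of your site. Selenium also ships its own WebDriverWait class for waiting on a condition, which may be a better fit if you are waiting for a specific element to appear.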