How do I get the HTML page of a site that has a verification check? requests


The site runs a verification check before it shows the page.

How can I get data from such a site?

>Solution :

As I already wrote in the comments, try the selenium library. It automates a real browser, so the page loads the way it would for a normal visitor.

Before starting, install selenium and webdriver-manager (which downloads the matching driver for you):
pip install -U selenium webdriver-manager

Here is example code that should work for most sites (Chrome):

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager  # downloads a matching ChromeDriver (swap in the manager for your browser)
import time

URL = 'YOUR LINK'

# open the page in a real Chrome window
options = webdriver.ChromeOptions()
# Selenium 4 takes the driver path through a Service object; executable_path is no longer accepted
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.get(URL)
time.sleep(6)  # wait 6 seconds so the check can finish and the page can load

html = driver.page_source
print(html) # outputs html code

# save html to file
with open('saving.html', 'w', encoding='utf-8') as f:
    f.write(html)

driver.quit()  # close the browser and end the driver session
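
The fixed `time.sleep(6)` is a guess: it can be too short on a slow connection and wasteful on a fast one. As a rough sketch, not part of the original answer, the helper below polls `page_source` until a check you supply says the real page has arrived. The names `wait_for_html` and `is_ready` are hypothetical, and the helper only assumes the driver object exposes a `page_source` attribute, as a selenium WebDriver does.

```python
import time


def wait_for_html(driver, is_ready, timeout=30.0, poll=0.5):
    """Return driver.page_source once is_ready(html) is true.

    driver   : any object with a page_source attribute (e.g. a selenium WebDriver)
    is_ready : hypothetical caller-supplied function, html -> bool
    timeout  : seconds to keep trying before giving up
    poll     : seconds to sleep between attempts
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source
        if is_ready(html):
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"page not ready after {timeout} seconds")
        time.sleep(poll)
```

It could be used in place of the sleep as `html = wait_for_html(driver, lambda page: 'id="content"' in page)`, where `'id="content"'` is only a placeholder for some text that appears on the target page and not on the check screen. Selenium also ships its own `WebDriverWait` class for waiting on conditions, which may be a better fit if you are waiting for a specific element.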
