I’m scraping data from a sports website. This is one of the pages where the code gets stuck:
https://www.unitedrugby.com/clubs/dhl-stormers/kwenzo-blose
If you open it in an incognito browser window, a cookie consent window appears that has to be accepted first. Here’s how I do it:
driver.get(url)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
WebDriverWait(driver, 8).until(
    EC.element_to_be_clickable((By.XPATH, "//button[@class='flex items-center justify-center py-2.5 px-[18px] lg:px-[22px] border border-solid rounded-tl-lg rounded-br-lg text-base lg:text-xl tracking-[2px] font-urc-sans transition ease-linear duration-300 border-turquoise-primary bg-turquoise-primary bg-opacity-[0.08] text-slate-deep hover:text-turquoise-secondary mx-auto md:order-last md:ml-4 md:mr-0']"))
)
WebDriverWait(driver, 8).until(
    EC.element_to_be_clickable((By.XPATH, "//button/span[contains(., 'ACCEPT ALL')]"))
).click()
el = driver.get_attribute("outerHTML")
el = BeautifulSoup(el, "html.parser")
...
I’m not a pro with Selenium and am still learning, so I can’t figure out what I’m doing wrong. The script either gets stuck at the cookies window or crashes at the .get_attribute() call with:
AttributeError: 'WebDriver' object has no attribute 'get_attribute'
More than 800 players have a page on this website, and I have successfully scraped the data of many of them every week. Something in the website’s architecture must have changed, but even today the scraping of some other players went fine, while this one page just won’t be scraped.
I hope someone can shed some light on the issue for me!
Solution:
get_attribute() is a method of WebElement, not of WebDriver, which is why calling it on driver raises the AttributeError.
Change this line:
el = driver.get_attribute("outerHTML")
To:
el = driver.find_element(By.XPATH, "//*")
el = el.get_attribute("outerHTML")
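For reference, here is a minimal sketch of the whole flow, assuming Selenium 4. The helper names click_accept_all and page_html are hypothetical and not part of the question. The sketch waits only on the 'ACCEPT ALL' text from the question instead of the long class string. An exact class match stops matching as soon as the site changes a single class, which may be why the first wait hangs, although that is only a guess. It also treats a missing cookie banner as non-fatal, and it reads the HTML from driver.page_source, a property that WebDriver does provide.

```python
def click_accept_all(driver, timeout=8):
    """Click the 'ACCEPT ALL' cookie button if it appears within `timeout` seconds.

    Returns True if the button was clicked, False if it never became clickable.
    (Hypothetical helper; the XPath is the one used in the question.)
    """
    # Imported inside the function so that page_html() below can be used
    # (and tested) without Selenium being installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable(
                (By.XPATH, "//button/span[contains(., 'ACCEPT ALL')]")
            )
        ).click()
        return True
    except TimeoutException:
        # No banner showed up (for example, cookies were already accepted).
        return False


def page_html(driver):
    """Return the HTML of the current page as a string.

    page_source lives on the WebDriver itself, so no element lookup is needed.
    (Hypothetical helper.)
    """
    return driver.page_source
```

After driver.get(url), this would be used as click_accept_all(driver) followed by BeautifulSoup(page_html(driver), "html.parser"), as in the question. Note that page_source and the outerHTML of the root element can differ slightly, for example in whether the doctype is included, so either should work for parsing.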