Python Selenium browser accepts cookies and raises `AttributeError`

I’m scraping data from a sports website. This is one of the pages, where the code gets stuck:

https://www.unitedrugby.com/clubs/dhl-stormers/kwenzo-blose

When opening it in an incognito browser you’ll see a cookie-consent window that you have to accept. Here’s how I do it:


from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

driver.get(url)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

# Wait until the cookie banner's button is clickable
WebDriverWait(driver, 8).until(
    EC.element_to_be_clickable((By.XPATH, "//button[@class='flex items-center justify-center py-2.5 px-[18px] lg:px-[22px] border border-solid rounded-tl-lg rounded-br-lg text-base lg:text-xl tracking-[2px] font-urc-sans transition ease-linear duration-300 border-turquoise-primary bg-turquoise-primary bg-opacity-[0.08] text-slate-deep hover:text-turquoise-secondary mx-auto md:order-last md:ml-4 md:mr-0']"))
)

# Click the "ACCEPT ALL" button
WebDriverWait(driver, 8).until(
    EC.element_to_be_clickable((By.XPATH, "//button/span[contains(., 'ACCEPT ALL')]"))
).click()

el = driver.get_attribute("outerHTML")
el = BeautifulSoup(el, "html.parser")

... 

I’m not a pro with Selenium, I’m still learning, so I can’t figure out what I’m doing wrong. It either gets stuck at the cookie window, or it crashes at the .get_attribute() call with:

AttributeError: 'WebDriver' object has no attribute 'get_attribute'

There are more than 800 players with a page on this website, and I have successfully scraped data for many of them every week. They must have changed something in the website’s architecture, but even today some scrapes of other players went fine, while this one just refuses to be scraped.

I hope someone can shed some light on the issue for me!

>Solution :

`get_attribute` is a method of `WebElement`, not `WebDriver`.

Change this line:

el = driver.get_attribute("outerHTML")

To:

el = driver.find_element(By.XPATH, "//*")
el = el.get_attribute("outerHTML")

The XPath `//*` matches the document’s root `<html>` element, so its outerHTML is the full page. (Selenium also exposes `driver.page_source`, which returns the current page’s HTML in a single call.)
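Once `el` holds the outerHTML string, the BeautifulSoup step works exactly as before. A minimal sketch of that parsing stage, using a made-up HTML snippet (the class names and markup here are purely illustrative, not the real site’s):

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the string returned by
# el.get_attribute("outerHTML") -- the real page's markup will differ.
html = """
<html><body>
  <h1 class="player-name">Kwenzo Blose</h1>
  <span class="position">Prop</span>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
name = soup.find("h1", class_="player-name").get_text(strip=True)
position = soup.find("span", class_="position").get_text(strip=True)
print(name, "-", position)
```

This prints `Kwenzo Blose - Prop`; the same `find`/`get_text` calls apply to the real outerHTML once the corrected `get_attribute` call returns it.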