I’m running into an error while scraping data from a website with Python and Selenium. The page has plenty of time to load, and I can see the element I want in the webdriver browser’s Inspect panel, but find_element still fails with NoSuchElementException.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait #waiting properly for things to load? [unused]
from selenium.webdriver.common.by import By #Finding particular things
import time
sURL = 'https://theanalyst.com/eu/2023/07/fifa-womens-world-cup-stats-2023-opta/'
myChromeOptions = webdriver.ChromeOptions()
#myChromeOptions.add_argument("--headless") #Would prefer this to be headless later, but need the debugging visibility now.
driver = webdriver.Chrome(options = myChromeOptions)
driver.get(sURL)
time.sleep(100) #Wait a sufficiently long time that I see the data I want in the browser that opens.
# At this point, if I Inspect inside the webdriver chrome browser, I see the data I'm looking for, namely a <div class="predictions-container">
element = driver.find_element(By.CLASS_NAME, "predictions-container") #Throws selenium.common.exceptions.NoSuchElementException
EDIT: Thanks, Shawn. This solved everything. For anybody from the future, here is the corrected code.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait #waiting properly for things to load
from selenium.webdriver.support import expected_conditions as EC # For waiting a proper amount of time for a specific thing.
from selenium.webdriver.common.by import By # Finding particular things
sURL = 'https://theanalyst.com/eu/2023/07/fifa-womens-world-cup-stats-2023-opta/'
myChromeOptions = webdriver.ChromeOptions()
myChromeOptions.add_argument("--headless") # This works with headless!
driver = webdriver.Chrome(options = myChromeOptions)
driver.get(sURL)
driver.switch_to.frame(driver.find_element(By.ID, "iFrameResizer0")) #From Shawn's answer
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "predictions-container"))) #Smarter waiting than before
element = driver.find_element(By.CLASS_NAME, "predictions-container") # This now sets element properly.
# Do Stuff with element here. After the next switch_to, element will throw selenium.common.exceptions.StaleElementReferenceException
driver.switch_to.default_content() # Switch back out of the frame (to the main page), in case I need to do other stuff. Again, from Shawn's answer.
>Solution :
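The desired element is wrapped within an IFRAME. Selenium only searches the current browsing context, so find_element on the top-level page cannot see elements inside the frame, even though Inspect shows them.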
Use the code below to switch into this IFRAME first, then perform whatever other actions you need:
driver.switch_to.frame(driver.find_element(By.ID, "iFrameResizer0"))
Once you have performed the actions within the IFRAME, use the code below to come out of it:
driver.switch_to.default_content()
