Getting a number from the PubChem site with Selenium

I’m running a search on the PubChem site with the code below. I need to read the "Compound CID:" number shown in the search result, but I couldn’t get it. I need help with this.

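import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
url = "https://pubchem.ncbi.nlm.nih.gov/"
driver.get(url)
driver.maximize_window()
# Locate the search box (find_element_by_xpath was removed in Selenium 4.3)
searchInput = driver.find_element(By.XPATH, "/html/body/div[1]/div/div/main/div[1]/div/div[2]/div/div[2]/form/div/div[1]/input")
searchInput.click()
searchInput.send_keys("75-05-8")
searchInput.send_keys(Keys.ENTER)
time.sleep(2)  # crude fixed wait for the results page
driver.close()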

Solution:

To print the Compound CID (6342 for this search), you can use either of the following locator strategies:

  • Using css_selector and get_attribute("innerHTML"):

    print(driver.find_element(By.CSS_SELECTOR, "a[data-label^='Featured Compound Result Secondary Link; Position:1; Page:1'] > span.breakword > span").get_attribute("innerHTML"))
    
  • Using xpath and text attribute:

    print(driver.find_element(By.XPATH, "//a[starts-with(@data-label, 'Featured Compound Result Secondary Link; Position:1; Page:1')]/span[@class='breakword']/span").text)
    

Ideally, you should wait for the element with WebDriverWait and visibility_of_element_located() before reading it, using either of the following locator strategies:

  • Using CSS_SELECTOR and text attribute:

    driver.get("https://pubchem.ncbi.nlm.nih.gov/")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[type='text'][id^='search']"))).send_keys("75-05-8" + Keys.RETURN)
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a[data-label^='Featured Compound Result Secondary Link; Position:1; Page:1'] > span.breakword > span"))).text)
    
  • Using XPATH and get_attribute("innerHTML"):

    driver.get("https://pubchem.ncbi.nlm.nih.gov/")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//input[@type='text'][starts-with(@id, 'search')]"))).send_keys("75-05-8" + Keys.RETURN)
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[starts-with(@data-label, 'Featured Compound Result Secondary Link; Position:1; Page:1')]/span[@class='breakword']/span"))).get_attribute("innerHTML"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console Output:

    6342
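
The pieces above can be combined into one reusable function. The following is only a minimal sketch, not a definitive implementation: the names get_compound_cid, parse_cid and CID_SELECTOR are hypothetical helpers introduced for illustration, the selectors are copied from the answer and depend on PubChem's current markup, and returning None on a timeout is an assumed design choice.

```python
import re

# Locator copied from the answer above; it depends on PubChem's current markup.
CID_SELECTOR = (
    "a[data-label^='Featured Compound Result Secondary Link; Position:1; Page:1']"
    " > span.breakword > span"
)


def parse_cid(raw):
    """Return the CID as an int, or None if the text is not a plain number."""
    match = re.fullmatch(r"\s*(\d+)\s*", raw or "")
    return int(match.group(1)) if match else None


def get_compound_cid(driver, query, timeout=20):
    """Search PubChem for `query`; return the featured result's CID or None."""
    # Imported here so parse_cid() stays usable without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    driver.get("https://pubchem.ncbi.nlm.nih.gov/")
    try:
        # Wait for the search box, then type the query and press Enter.
        wait.until(EC.element_to_be_clickable(
            (By.CSS_SELECTOR, "input[type='text'][id^='search']")
        )).send_keys(query + Keys.RETURN)
        # Wait until the CID of the featured result is visible.
        element = wait.until(EC.visibility_of_element_located(
            (By.CSS_SELECTOR, CID_SELECTOR)
        ))
    except TimeoutException:
        return None  # no featured result appeared, or the layout changed
    return parse_cid(element.text)
```

To try it, create a driver with webdriver.Chrome(), call get_compound_cid(driver, "75-05-8"), and call driver.quit() afterwards (ideally in a finally block) so the browser is released even if an error occurs. Converting the text to an int makes an unexpected page layout show up as None rather than as a stray string.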
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium – Python

