Getting an error in Selenium while scraping Amazon data

I wrote a program to scrape data from Amazon, but I am getting errors that I am unable to understand. I am using XPath to locate elements by class, and I am trying to extract the book names from an Amazon results page. The script searches Amazon for the keyword "hacking books" and the search itself succeeds, but no results are returned afterwards. I tried the following code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time as t
import pandas as pd

driver = webdriver.Chrome(executable_path='chromedriver.exe')

wait = WebDriverWait(driver, 5)
url = "https://www.amazon.com"
driver.get(url)

keyword = "hacking books"
search_book = driver.find_element(By.ID,'twotabsearchtextbox')
search_book.send_keys(keyword)
search_button = driver.find_element(By.ID,'nav-search-submit-button')
search_button.click()

big_list = []

while True:
    try:
        items = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//a[@class =alink-normal s-underline-text s-underline-link-text s-link-style a-text-normal]')))
        for i in items:
            big_list.append((i.text, i.get_attribute('href')))      
        next_page_button = wait.until(EC.element_to_be_clickable((By.XPATH, '//span[@class=s-pagination-strip]//a[contains(text(), "Next")]')))        
        next_page_button.location_once_scrolled_into_view
        t.sleep(10)
        next_page_button.click()
        print('clicked, going to next page')
        t.sleep(10)
    except TimeoutException:
        print('all pages done')
        break
df = pd.DataFrame(big_list, columns = ['Book', 'Url'])
print(df)
df.to_csv('hacking_books.csv')
driver.quit()

Can you help me find the bug?

Solution:

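Your XPath is not a valid expression: the attribute values in the predicates are not quoted.
A valid XPath expression looks like this:

Relative path: '//tagname[@attribute="value"]'

So you just have to wrap the class values in double quotes, for example [@class="s-pagination-strip"].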
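
As a minimal sketch of that fix, the two locators from the question could be written with quoted attribute values as below. The class names are copied verbatim from the question and may not match Amazon's current markup. The names ITEM_CLASS, ITEM_XPATH and NEXT_XPATH and the sample HTML are made up for illustration, and the standard-library ElementTree check is only a stand-in so the snippet runs without a browser.

```python
import xml.etree.ElementTree as ET

# Class value copied verbatim from the question; it may not match
# Amazon's current markup.
ITEM_CLASS = (
    "alink-normal s-underline-text s-underline-link-text "
    "s-link-style a-text-normal"
)

# Attribute values must be quoted inside the XPath expression.
ITEM_XPATH = f'//a[@class="{ITEM_CLASS}"]'
NEXT_XPATH = '//span[@class="s-pagination-strip"]//a[contains(text(), "Next")]'

# Stand-in check (an assumption for illustration, not part of the original
# script): ElementTree accepts the same quoted predicate once the path is
# made relative with a leading ".".
sample = ET.fromstring(
    f'<div><a class="{ITEM_CLASS}" href="/book-1">Book One</a>'
    '<a class="other" href="/x">Other</a></div>'
)
matches = sample.findall("." + ITEM_XPATH)
print([(a.text, a.get("href")) for a in matches])
```

In the question's script, these strings would be passed as (By.XPATH, ITEM_XPATH) and (By.XPATH, NEXT_XPATH) to the existing wait.until(...) calls.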
