Scraping data through changing Xpaths

I can’t figure out how to scrape this data. I am trying to scrape the product name, price, and other information from the website. The product names are easy to access because their XPaths are nearly identical, with only one index that changes, but for the prices several parts of the path change. Is there a way to scrape the data without these positional XPaths? Lookups by class name and ID return an empty string.


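from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service('E:/chromedriver/chromedriver.exe'))
product_name = []
product_price = []
product_rating = []
product_url = []
driver.get('https://www.cdiscount.com/bricolage/climatisation/traitement-de-l-air/ioniseur/l-166130303.html#_his_')
# Positional XPath: only the li index changes between product names
for i in range(1, 55):
    try:
        productname = driver.find_element('xpath', '//*[@id="lpBloc"]/li[' + str(i) + ']/a/div[2]/div/span').text
        product_name.append(productname)
    except Exception:
        print("none")
print(product_name)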


XPaths of the prices:

1st item's price
```//*[@id="lpBloc"]/li[1]/div[2]/div[3]/div[1]/div/div[2]/span[1]```

2nd item's price
```//*[@id="lpBloc"]/li[2]/div[2]/div[2]/div[1]/div/div[2]/span[1]```



>Solution :

You don't need a hardcoded loop. Instead, find an XPath that uniquely identifies each parent element (the product item), then locate the child elements relative to it. Only the rating is missing for some products, so wrap that lookup in a try..except block.

from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

# driver is the webdriver.Chrome instance created in the question
product_name = []
product_price = []
product_rating = []
product_url = []
driver.get('https://www.cdiscount.com/bricolage/climatisation/traitement-de-l-air/ioniseur/l-166130303.html#_his_')
# Each product is an li that carries a data-sku attribute
for item in driver.find_elements(By.XPATH, '//*[@id="lpBloc"]//li[@data-sku]'):
    # Paths starting with "." are resolved relative to the current item
    productname = item.find_element(By.XPATH, './/span[@class="prdtTit"]').text
    product_name.append(productname)
    productprice = item.find_element(By.XPATH, './/span[@class="price priceColor hideFromPro"]').text
    product_price.append(productprice)
    try:
        productRating = item.find_element(By.XPATH, './/span[@class="c-stars-rating"]//span[@class="u-visually-hidden"]').text
    except NoSuchElementException:
        # Not every product has a rating
        productRating = "Nan"
    product_rating.append(productRating)
    productUrl = item.find_element(By.XPATH, './/a[.//span[@class="prdtTit"]]').get_attribute("href")
    product_url.append(productUrl)

print(product_name)
print(product_price)
print(product_rating)
print(product_url)
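
If you want to try the parent-then-child idea without opening a browser, here is a minimal sketch using the standard library's `xml.etree.ElementTree` in place of Selenium. The `SAMPLE` markup, the `price` and `rating` class names, and the helpers `text_or_default` and `scrape` are all made up for illustration; they are not the real page structure. ElementTree also supports only a subset of XPath, so treat this as a demonstration of the lookup pattern rather than a drop-in replacement.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for the real listing page.
# The price sits at a different depth in each item, and the second item
# has no rating, which is what breaks positional XPaths.
SAMPLE = """
<ul id="lpBloc">
  <li data-sku="A1">
    <a href="/p/a1"><div><span class="prdtTit">Ionizer A</span></div></a>
    <div><div><div><span class="price">19,99</span></div></div></div>
    <span class="rating">4.5</span>
  </li>
  <li data-sku="B2">
    <a href="/p/b2"><div><span class="prdtTit">Ionizer B</span></div></a>
    <div><span class="price">24,50</span></div>
  </li>
  <li class="ad">Sponsored banner</li>
</ul>
"""


def text_or_default(parent, path, default="NaN"):
    """Return the stripped text of the first match under parent, else default."""
    node = parent.find(path)
    if node is not None and node.text:
        return node.text.strip()
    return default


def scrape(markup):
    root = ET.fromstring(markup)
    rows = []
    # Step 1: select only the product items (those carrying data-sku).
    for item in root.findall(".//li[@data-sku]"):
        # Step 2: look up each field relative to the item, by class
        # rather than by position in the tree.
        link = item.find(".//a[@href]")
        rows.append({
            "name": text_or_default(item, ".//span[@class='prdtTit']"),
            "price": text_or_default(item, ".//span[@class='price']"),
            "rating": text_or_default(item, ".//span[@class='rating']"),
            "url": link.get("href") if link is not None else "NaN",
        })
    return rows


rows = scrape(SAMPLE)
for row in rows:
    print(row)
```

Because every lookup starts from the item itself, it does not matter how many wrapper `div`s sit between the `li` and the price, and a missing rating yields the default instead of raising.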