Web-scraping script is not finding element given its class name. Unsure what I'm missing in this code

Problem Description:

I’m trying to get the URL of an image on this page: https://www2.hm.com/en_us/productpage.1109917007.html

The first picture (pic #1) shows the desired image highlighted in blue, along with its class name. Clicking that image opens its fullscreen version (pic #2), whose URL I’m trying to get.

I wrote the code so that it clicks on the image and then gets the URL of the fullscreen image, but I am not getting any output. What am I missing here?

pic #1

pic #2

What I tried (code below): I opened the product page, looked up the desired image by its class name, clicked on it, and then tried to retrieve the URL of the fullscreen image.

What I expected: I expected this code to retrieve the desired image’s URL and print "Found image!"


from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By 
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC 
import time


def web_driver():
    options = webdriver.ChromeOptions()
    options.add_argument("--verbose")
    options.add_argument('--no-sandbox')
    options.add_argument('--headless')
    options.add_argument('--disable-gpu')
    options.add_argument("--window-size=1920, 1200")
    options.add_argument('--disable-dev-shm-usage')
    driver = webdriver.Chrome(options=options)
    return driver

driver = web_driver()
driver.get('https://www2.hm.com/en_us/productpage.1109917007.html')
delay = 5
image_urls = set()
thumbnails = driver.find_elements(By.CLASS_NAME, "product-detail-thumbnail-image")
print(thumbnails)
for img in thumbnails:
  try:
    img.click()
    time.sleep(delay)
  except:
    continue
  images = driver.find_elements(By.CLASS_NAME, "current fullscreen-image-modal-image")
  for image in images:
    if image.get_attribute('src') and 'http' in image.get_attribute('src'):
      image_urls.add(image.get_attribute('src'))
      print('Found image!')
      continue
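
A likely reason the second lookup in the code above returns nothing: By.CLASS_NAME expects a single class name, while "current fullscreen-image-modal-image" is two class names separated by a space. A minimal sketch of the usual workaround follows, which turns the space-separated classes into a CSS selector. The helper names classes_to_css and find_by_classes are hypothetical, and the class names are taken from the question as written, not checked against the live page.

```python
def classes_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute into a CSS selector.

    "a b" becomes ".a.b", which matches elements carrying both classes.
    """
    return "".join("." + name for name in class_attr.split())


def find_by_classes(driver, class_attr: str):
    """Find elements that carry every class in class_attr (hypothetical helper)."""
    # Imported lazily so classes_to_css can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    return driver.find_elements(By.CSS_SELECTOR, classes_to_css(class_attr))


selector = classes_to_css("current fullscreen-image-modal-image")
print(selector)  # .current.fullscreen-image-modal-image
```

With a driver in hand, find_by_classes(driver, "current fullscreen-image-modal-image") would take the place of the By.CLASS_NAME call. If print(thumbnails) already shows an empty list, the thumbnails were not found either, which suggests the page had not finished rendering or serves different markup to headless Chrome. Waiting with WebDriverWait (already imported in the question) is worth trying in that case.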

Solution:

The product info is embedded in a <script> element on the page, so you can pull it out with re and parse it with ast.literal_eval, without driving a browser:

import json
import re
from ast import literal_eval

import requests

product_id = "1109917007"  # taken from the URL; the last 3 characters appear to be the color variant
url = "https://www2.hm.com/en_us/productpage.1109917007.html"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0"
}

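html_text = requests.get(url, headers=headers).text

# The embedded object is JavaScript, not JSON, so rewrite it into a Python literal:
# keep the desktop branch of each `isDesktop ? '...' : '...'` ternary
html_text = re.sub(r"isDesktop \? '([^']+)' : '([^']+)'", r"'\g<1>'", html_text)
# and map the JavaScript literals true/false/null to True/False/None
html_text = re.sub(r"true", r"True", html_text)
html_text = re.sub(r"false", r"False", html_text)
html_text = re.sub(r"null", r"None", html_text)

# grab the object assigned to productArticleDetails and evaluate it safely
data = re.search(r"productArticleDetails = (\{.*?\});", html_text, flags=re.S).group(1)
data = literal_eval(data)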

# print all data:
# print(json.dumps(data, indent=4))

# keys that share the product number without its last 3 characters are its color variants
for k in data:
    if k.startswith(product_id[:-3]):
        print("Color:", data[k]["name"])
        print("-" * 80)
        for i in data[k]["images"]:
            print("https:" + i["fullscreen"])
        print()

Prints:

Color: Black
--------------------------------------------------------------------------------
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F98%2F2d%2F982d385e56684698281797aaccccddd2fbb24000.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F5d%2F5e%2F5d5e407e0d81f539f191f66832b80041e5cdb4ed.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F05%2Ff3%2F05f369cfae2758a389c53279860e37b1cc491163.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fa6%2F9d%2Fa69db194655f2726ae046f8431b26a51223a06de.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fe3%2Fef%2Fe3ef8cfda92d48f3462ced35b6b20e1701175610.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fea%2F4d%2Fea4db6703dacf6d08a5096fd35d64f22635129a7.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fcb%2F26%2Fcb26ad968570dfd16ce1b15a23fbfe745600c460.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F99%2Faf%2F99af20288ff42b2b37150e3f9e08cf4c41779d46.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fff%2F81%2Fff811ffbce5597233b7411c224ecf026f7a9e458.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BDESCRIPTIVESTILLLIFE%5D%2Cres%5Bm%5D%2Chmver%5B2%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F0b%2Fd0%2F0bd00adae7d1feb9745ec3094783e18984d679f8.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BDESCRIPTIVEDETAIL%5D%2Cres%5Bm%5D%2Chmver%5B2%5D&call=url[file:/product/fullscreen]

Color: Green
--------------------------------------------------------------------------------
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F34%2F30%2F343092fe2836f8180599906a22aedfe796ace04f.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F79%2F10%2F791029a9dbd1ba9d7f350f7dc6b73d7f0fd3c50f.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fb0%2F84%2Fb084247de8696d49d0b7f1196283a37738815b4d.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fb2%2F98%2Fb298c1d2b2eb7b9de8c8b71c838c3a79f636ad25.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F2f%2Fd9%2F2fd9d76b9f220be376581694315e63df21f62e3b.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F01%2F66%2F016612842a3217f82189b2d8e9284a81d0f11f0c.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5Bladies_dresses_longsleevedress%5D%2Ctype%5BDESCRIPTIVESTILLLIFE%5D%2Cres%5Bm%5D%2Chmver%5B2%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F57%2F08%2F5708c1a6ef58bc2d9d6149360de38b24866aa9ee.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5Bladies_dresses_longsleevedress%5D%2Ctype%5BDESCRIPTIVEDETAIL%5D%2Cres%5Bm%5D%2Chmver%5B2%5D&call=url[file:/product/fullscreen]

Color: Burgundy
--------------------------------------------------------------------------------
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F77%2F1a%2F771a7c54e25f342b23993d67a03da7ef3a7f0227.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F78%2F25%2F782577fbee3259e57daebacd77deae987952dd52.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F3b%2F9c%2F3b9c910350c73703e9fa19c04954469bc5092ee0.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Ff2%2F7a%2Ff27ac48abcd1fd3af4930625349db11e4304e8e5.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F9e%2F1b%2F9e1bca254c6cabaf0924e4f12b9d5c3de0a5e079.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5Bladies_dresses_longsleevedress%5D%2Ctype%5BDESCRIPTIVESTILLLIFE%5D%2Cres%5Bm%5D%2Chmver%5B2%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F13%2F33%2F13337567931c75ca1a59b4c02dddf6f45b61ffb2.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5Bladies_dresses_longsleevedress%5D%2Ctype%5BDESCRIPTIVEDETAIL%5D%2Cres%5Bm%5D%2Chmver%5B2%5D&call=url[file:/product/fullscreen]

Color: Light beige
--------------------------------------------------------------------------------
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fad%2F79%2Fad79561ecb2ab9fdb3c77cd0aff996628b63bd6a.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F3c%2F79%2F3c790c4ae0949a1690da550293b2c2f57c01772c.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F17%2F13%2F17137a1449b12fc42a8c69042408e6d57e912ea3.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fde%2Fe9%2Fdee94a36a5898f9008680f79070bf271bf6d5790.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5Bladies_dresses_longsleevedress%5D%2Ctype%5BDESCRIPTIVESTILLLIFE%5D%2Cres%5Bm%5D%2Chmver%5B2%5D&call=url[file:/product/fullscreen]
https://lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2Fc1%2F8c%2Fc18c37d8f1ae19071002165b988087c1984e075d.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5Bladies_dresses_longsleevedress%5D%2Ctype%5BDESCRIPTIVEDETAIL%5D%2Cres%5Bm%5D%2Chmver%5B2%5D&call=url[file:/product/fullscreen]
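
To try the parsing steps without hitting the site, here is a small offline sketch of the same technique. SAMPLE_HTML is a made-up stand-in for the page source (the example.com URLs and the inStock and video keys are invented for illustration), and parse_article_details is a hypothetical helper name. It also fails with a clear error when the object is missing, which can happen if the page layout changes.

```python
import re
from ast import literal_eval

# Made-up stand-in for the page source; the real page is much larger.
SAMPLE_HTML = """
<script>
var productArticleDetails = {
    '1109917007': {
        'name': 'Black',
        'inStock': true,
        'video': null,
        'images': [
            {'fullscreen': isDesktop ? '//example.com/desktop.jpg' : '//example.com/mobile.jpg'}
        ]
    }
};
</script>
"""


def parse_article_details(html_text: str) -> dict:
    """Extract the productArticleDetails object and parse it as a Python literal."""
    # Keep the desktop branch of each `isDesktop ? 'a' : 'b'` ternary.
    text = re.sub(r"isDesktop \? '([^']+)' : '([^']+)'", r"'\g<1>'", html_text)
    # Map JavaScript literals to Python ones; \b leaves alone longer words
    # that merely contain "true", "false" or "null".
    for js, py in (("true", "True"), ("false", "False"), ("null", "None")):
        text = re.sub(rf"\b{js}\b", py, text)
    match = re.search(r"productArticleDetails = (\{.*?\});", text, flags=re.S)
    if match is None:
        raise ValueError("productArticleDetails not found")
    return literal_eval(match.group(1))


details = parse_article_details(SAMPLE_HTML)
for article_id, article in details.items():
    urls = ["https:" + image["fullscreen"] for image in article["images"]]
    print(article_id, article["name"], urls)
```

literal_eval only accepts Python literals, so unlike eval it will not execute code that happens to be in the page. The trade-off is that any JavaScript construct the substitutions do not cover will raise an error instead of being parsed.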
