Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python | Selenium Issue with scrolling down and find by class name

For one study research I would like to scrape some links from webpages which located out of viewport (to see this links you need to scroll down the page).

I have write the script but I get empty list:

from selenium import webdriver
import time

browser = webdriver.Firefox()
browser.get('https://www.twitch.tv/lirik')

time.sleep(3)
browser.execute_script("window.scrollBy(0,document.body.scrollHeight)")

time.sleep(3)

panel_blocks = browser.find_elements(by='class name', value='Layout-sc-nxg1ff-0 itdjvg default-panel')
browser.close()
print(panel_blocks)
print(type(panel_blocks))

I just get empty list after page was loaded. Here is output from the script above:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

/usr/local/bin/python /Users/greg.fetisov/PycharmProjects/baltazar_platform/Twitch_parser.py
[]
<class 'list'>

Process finished with exit code 0

p.s.
when webdriver opens the page, I see there is no scroll down action. It just open a page and then close it after time.sleep cooldown.

How I can change the script to get the links properly?

Any help or advice would be appreciated!

>Solution :

To print the values of the href attribute you have to induce WebDriverWait for the visibility_of_all_elements_located() and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR:

    driver.get("https://www.twitch.tv/lirik")
    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.Layout-sc-nxg1ff-0.itdjvg.default-panel > a")))])
    
  • Console Output:

    ['https://www.amazon.com/dp/B09FVR22R2', 'http://bs.serving-sys.com/Serving/adServer.bs?cn=trd&pli=1077437714&gdpr=$%7BGDPR%7D&gdpr_consent=$%7BGDPR_CONSENT_68%7D&adid=1085757156&ord=[timestamp]', 'https://store.epicgames.com/lirik/rumbleverse', 'https://bitly/3GP0cM0', 'https://lirik.com/', 'https://streamlabs.com/lirik', 'https://twitch.amazon.com/tp', 'https://www.twitch.tv/subs/lirik', 'https://www.youtube.com/lirik?sub_confirmation=1', 'http://www.twitter.com/lirik', 'http://www.instagram.com/lirik', 'http://gfuel.ly/lirik', 'http://www.cyberpowerpc.com/', 'https://www.cyberpowerpc.com/page/Intel/LIRIK/', 'https://discord.gg/lirik', 'http://www.amazon.com/?_encoding=UTF8&camp=1789&creative=390957&linkCode=ur2&tag=l0e6d-20&linkId=YNM2SXSSG3KWGYZ7']
    
  • Note : You have to add the following imports :

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading