XPath matches the wrong heading using Selenium

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

options = webdriver.ChromeOptions()
options.add_argument("--no-sandbox")
options.add_argument("--disable-gpu")
options.add_argument("--window-size=1920,1080")
options.add_argument("--disable-extensions")
# Pass the options to the driver; otherwise they have no effect.
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
wait = WebDriverWait(driver, 20)

URL = 'https://www.askgamblers.com/online-casinos/reviews/yukon-gold-casino-casino'
driver.get(URL)

data = driver.find_elements(By.XPATH, "//section[@class='review-text richtext']")

for row in data:
    try:
        para0 = row.find_element(By.XPATH, "//h2[text()[contains(.,'Games')]]/following-sibling::p[following::h2[text()[contains(.,'Support')]]]").text
    except NoSuchElementException:
        # Skip sections without a matching paragraph instead of printing an undefined name.
        continue

    print(para0)

I want to collect only the text under the Games heading, but the XPath also returns the text under Virtual Games. How can I restrict the contains() condition so that it matches only Games? Please recommend a solution. This is the page link: https://www.askgamblers.com/online-casinos/reviews/yukon-gold-casino-casino

Wanted: only the text under the Games heading.


Not wanted: the text under the Virtual Games heading.

>Solution :

[contains(.,'Games')] is a substring test, so it will match both Games and Virtual Games.
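
As a rough analogy only, the behaviour of contains(), an exact text() comparison, and starts-with() can be mimicked with plain Python string operations. The heading list below is illustrative (taken from the headings mentioned in the question), and nothing here touches Selenium or the real page:

```python
# Heading texts mentioned in the question (illustrative values only).
headings = ["Virtual Games", "Games", "Support"]

# contains(., 'Games') behaves like a substring test.
contains_matches = [h for h in headings if "Games" in h]

# text()='Games' behaves like an exact comparison.
equals_matches = [h for h in headings if h == "Games"]

# starts-with(text(), 'Games') behaves like a prefix test.
starts_with_matches = [h for h in headings if h.startswith("Games")]

print("contains:   ", contains_matches)
print("equals:     ", equals_matches)
print("starts-with:", starts_with_matches)
```

Only the substring test picks up "Virtual Games"; the exact and prefix tests keep just "Games".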
To match only Games, you can do one of the following:

  1. Use an exact comparison instead of contains, like this:
"[text()='Games']"
  2. Or use starts-with:
"[starts-with(text(), 'Games')]"

So the find_element line in the question, which uses "//h2[text()[contains(.,'Games')]]", can be changed to

para0= row.find_element(By.XPATH,"//h2[text()='Games']/following-sibling::p[following::h2[contains(.,'Support')]]").text

or

para0= row.find_element(By.XPATH,"//h2[starts-with(text(), 'Games')]/following-sibling::p[following::h2[contains(.,'Support')]]").text
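
Note that find_element returns only the first matching element, so either line above yields a single paragraph. If the Games section spans several paragraphs, one option is to collect every p between the Games heading and the next h2. The following is a minimal sketch of that idea using Python's standard xml.etree.ElementTree on a hypothetical, simplified snippet; the markup and the helper name paragraphs_under are assumptions for illustration and may not reflect the page's actual structure:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page structure may differ.
SNIPPET = """
<section>
  <h2>Virtual Games</h2>
  <p>Virtual games text.</p>
  <h2>Games</h2>
  <p>First games paragraph.</p>
  <p>Second games paragraph.</p>
  <h2>Support</h2>
  <p>Support text.</p>
</section>
"""


def paragraphs_under(section, heading_text):
    """Return the text of each <p> between the matching <h2> and the next <h2>."""
    collected = []
    inside = False
    for child in section:
        if child.tag == "h2":
            # Exact comparison, so "Virtual Games" does not match "Games".
            inside = (child.text or "").strip() == heading_text
        elif child.tag == "p" and inside:
            collected.append((child.text or "").strip())
    return collected


section = ET.fromstring(SNIPPET)
games_paragraphs = paragraphs_under(section, "Games")
print(games_paragraphs)
```

The exact comparison on the heading text is what keeps the Virtual Games paragraphs out of the result.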