XPath: How do I capture the previous element?

I have the following markup:

<p>File name</p>
<a href="https://somelink.pdf">Download</a>

I need to capture both the link (the a element) and its name (the p element) using CSS and XPath. First, I use the CSS selector a[href$=".pdf"] to find all links whose href value ends in .pdf:

for i in response.css('a[href$=".pdf"]'):
    link = i.css('::attr("href")').get()
    name = i.xpath(?????????)
    print(name, link)

How do I capture the text in the p element using XPath?

Solution:

Starting from a

This XPath,

//a[.="Download"]/preceding-sibling::p[1]

will select, for each a element whose string value equals "Download", the nearest preceding sibling p element.


Starting from p

This XPath,

//p[.="File name"]/following-sibling::a[1]

will select, for each p element whose string value equals "File name", the nearest following sibling a element.


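In either case, you can select the text node children of the matched element by appending /text() to the XPath. Inside a loop where the current node is already the a element, use the relative form preceding-sibling::p[1]/text(), with no leading //.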
