I’d like to return the href value of the xml-stylesheet line in an XML file using XPath, but I can’t figure out an expression that references the attributes inside <?...?> tags, if that’s even possible. Sample file:
<?xml-stylesheet type="text/xsl" href="https://foo.bar" ?>
<test>
<test2>
</test2>
</test>
The goal is to figure out which stylesheet a given XML file references, so in this case I’d like to return "https://foo.bar" as the result set. Any ideas?
>Solution :
You can access a Processing Instruction (PI) in the prolog the same way you access it within the rest of an XML document, via a processing-instruction() node test.
Side note: An XML declaration,
<?xml version="1.0" encoding="UTF-8"?>
which can only appear in the prolog, and which looks like a PI, is not a PI and cannot be accessed via XPath.
For example,
//processing-instruction('xml-stylesheet')
will select the PI node, whose string value is
type="text/xsl" href="https://foo.bar"
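To see the same distinction outside an XPath engine, here is a minimal sketch in Python. It assumes only the standard library's xml.dom.minidom, which has no XPath support, so it walks the children of the document node instead of evaluating the processing-instruction() node test. The names SAMPLE and prolog_pis are made up for this illustration.

```python
from xml.dom import minidom

# The sample file from the question, with an XML declaration added in front.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<?xml-stylesheet type="text/xsl" href="https://foo.bar" ?>\n'
    '<test><test2></test2></test>'
)


def prolog_pis(xml_text):
    """Return (target, data) pairs for PIs that are children of the document node."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]


if __name__ == "__main__":
    for target, data in prolog_pis(SAMPLE):
        # The XML declaration is not reported here; only real PIs are.
        print(target, "->", data.strip())
```

The PI's data comes back as one opaque string: type and href are pseudo-attributes by convention only, which is why the extraction step needs string processing rather than an attribute axis.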
You can then use string processing functions, which vary per version of XPath, to extract whichever pieces of that string you need.
For example, if you’re stuck with only XPath 1.0, then
"substring-before(
   substring-after(//processing-instruction('xml-stylesheet')[1],
                   'href=&quot;'),
   '&quot;')"
(which is written as if to be embedded in XSLT — finagle the quotes as necessary for other hosting languages) will return
https://foo.bar
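If the host language is not XSLT, the same two steps (find the PI, then cut the href out of its string value) can be sketched in Python. This is an illustration under assumptions, not the XPath expression itself: it uses the standard library's xml.dom.minidom plus a regular expression in place of substring-before/substring-after, and the function name stylesheet_href is hypothetical. Unlike the XPath 1.0 expression above, the regular expression also accepts single-quoted values.

```python
import re
from xml.dom import minidom

# The sample file from the question.
SAMPLE = (
    '<?xml-stylesheet type="text/xsl" href="https://foo.bar" ?>\n'
    '<test>\n<test2>\n</test2>\n</test>'
)

# href pseudo-attribute, quoted with either " or '.
HREF_RE = re.compile(r"""\bhref\s*=\s*(["'])(.*?)\1""")


def stylesheet_href(xml_text):
    """Return the href of the first xml-stylesheet PI in the prolog, or None."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            match = HREF_RE.search(node.data)
            if match:
                return match.group(2)
    return None


if __name__ == "__main__":
    print(stylesheet_href(SAMPLE))
```

Returning None when no matching PI exists keeps the "no stylesheet referenced" case distinct from an empty href value.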
See also
- What is the XPath expression to select a Processing instruction?
- Can I use XPath to get the content of a processing instruction in XML?