lxml xpath can't find a tag below the first one in XML

I have an XML document that looks something like this:

<MyXmlRoot>
<App xmlns='urn:SomethingSomething1'>
    ...
</App>
<User xmlns='urn:SomethingSomething2'>
    ...
</User>
<Doc xmlns='urn:SomethingSomething3'>
    <level2>
        <level3>
            <level4>
                <level5>
                    <level6>
                        <level7>
                            <level8>
                                <level9>
                                    <level10>Content at the deepest level</level10>
                                </level9>
                            </level8>
                        </level7>
                    </level6>
                </level5>
            </level4>
        </level3>
    </level2>
</Doc>
</MyXmlRoot>

I use lxml to read and parse it like this:

from lxml import etree

tree = etree.parse("textxml.xml")
root = tree.getroot()

If I pretty-print from root, it shows the entire XML, which is good. But when I try to read the value of a specific tag like so:

content = root.xpath('//level10/text()')

xpath can't find any tag below the root and returns an empty list.
I suspect it's because of the namespaces, but I can't find a way to make xpath read the values.
Any advice?

Solution:

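Every element under <Doc> inherits its default namespace, urn:SomethingSomething3, so the unqualified name level10 matches nothing. Prefix the tag you want to search for with the namespace in curly braces, {urn:SomethingSomething3}level10: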

from lxml import etree

xml_data = """
<MyXmlRoot>
    <App xmlns='urn:SomethingSomething1'>
    </App>
    <User xmlns='urn:SomethingSomething2'>
    </User>
    <Doc xmlns='urn:SomethingSomething3'>
        <level2>
            <level3>
                <level4>
                    <level5>
                        <level6>
                            <level7>
                                <level8>
                                    <level9>
                                        <level10>Content at the deepest level</level10>
                                    </level9>
                                </level8>
                            </level7>
                        </level6>
                    </level5>
                </level4>
            </level3>
        </level2>
    </Doc>
</MyXmlRoot>
"""

root = etree.fromstring(xml_data)

level10_text = root.find(".//{urn:SomethingSomething3}level10").text
print("Text from <level10> tag:", level10_text)

Prints:

Text from <level10> tag: Content at the deepest level
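
To see why the qualified name is needed, it can help to inspect how lxml stores the tags. The following is only a small sketch on a trimmed-down copy of the document from the question (the intermediate levels are left out for brevity); the variable names are illustrative.

```python
from lxml import etree

# Trimmed-down version of the document from the question.
xml_data = """
<MyXmlRoot>
    <Doc xmlns='urn:SomethingSomething3'>
        <level10>Content at the deepest level</level10>
    </Doc>
</MyXmlRoot>
"""
root = etree.fromstring(xml_data)

doc = root[0]        # the <Doc> element
level10 = doc[0]     # its <level10> child

# lxml keeps the namespace as part of the tag, in {uri}localname form.
print(doc.tag)
print(level10.tag)

# QName splits a qualified tag into its namespace and local name.
qname = etree.QName(level10)
print(qname.namespace, qname.localname)

# An unqualified XPath name only matches elements that are in no namespace,
# so this returns an empty list.
unqualified = root.xpath("//level10")
print(unqualified)
```

Both tags come back with the urn:SomethingSomething3 prefix in braces, even though only <Doc> declares xmlns, which is why //level10 on its own finds nothing.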

Alternatively, use etree.ETXPath, which accepts the same {namespace}tag notation inside a full XPath expression:

to_search = etree.ETXPath("//{urn:SomethingSomething3}level10/text()")
level10_text = to_search(root)
print("Text from <level10> tag:", level10_text)

Prints:

Text from <level10> tag: ['Content at the deepest level']
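
Since the question uses root.xpath(), here is a sketch of how the same lookup might be written with it directly. The xpath() method does not understand the {namespace}tag notation, but it does take a namespaces mapping from prefix to URI. The prefix "d" below is an arbitrary choice made for this example; only the URI has to match the document. The XML is again a shortened copy of the one in the question.

```python
from lxml import etree

# Shortened copy of the document from the question.
xml_data = """
<MyXmlRoot>
    <Doc xmlns='urn:SomethingSomething3'>
        <level2>
            <level10>Content at the deepest level</level10>
        </level2>
    </Doc>
</MyXmlRoot>
"""
root = etree.fromstring(xml_data)

# Map a prefix of our own choosing ("d") to the namespace URI used by <Doc>.
ns = {"d": "urn:SomethingSomething3"}
content = root.xpath("//d:level10/text()", namespaces=ns)
print(content)

# If the namespace is not known in advance, matching on the local name
# ignores namespaces entirely (at the cost of being less precise).
any_ns = root.xpath("//*[local-name()='level10']/text()")
print(any_ns)
```

Both calls should print ['Content at the deepest level']. The prefix mapping is usually the safer option, because local-name() would also match a level10 element from any other namespace.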