XML: Print the element preceding a findall() match

I’m working with an XML corpus that looks like this:

<corpus>
  <dialogue speaker="A">
    <sentence tag1="attribute1" tag2="attribute2"> Hello </sentence>
  </dialogue>
  <dialogue speaker="B">
    <sentence tag1="different_attribute1" tag2="different_attribute2"> How are you </sentence>
  </dialogue>
</corpus>

I use root.findall() to find every sentence whose tag2 attribute is "different_attribute2", but I would like to print not only the parent element of each match (with its sentence) but also the element that comes before it:

{'speaker': 'A'}
Hello
{'speaker': 'B'}
How are you

I’m quite new to coding, so I’ve tried a bunch of for loops and if statements without success. I start with:

for words in root.findall('.//sentence[@tag2="different_attribute2"]'):
    for speaker in root.findall('.//sentence[@tag2="different_attribute2"]...'):
        print(speaker.attrib)
        print(words.text)

But then I have absolutely no idea how to retrieve speaker A. Can anyone help me?

Solution:

Using lxml, a single XPath union expression can select all four values (both speakers and both sentences). The results come back in document order:

>>> from lxml import etree
>>> tree = etree.parse('/home/lmc/tmp/test.xml')
>>> for e in tree.xpath('//sentence[@tag2="different_attribute2"]/parent::dialogue/@speaker | //sentence[@tag2="different_attribute2"]/text() | //dialogue[following-sibling::dialogue/sentence[@tag2="different_attribute2"]]/sentence/text() | //dialogue[following-sibling::dialogue/sentence[@tag2="different_attribute2"]]/@speaker'):
...      print(e)
... 
A
 Hello 
B
 How are you 

XPath details

Find speaker B
//sentence[@tag2="different_attribute2"]/parent::dialogue/@speaker

Find sentence of B
//sentence[@tag2="different_attribute2"]/text()

Find sentence of A given B
//dialogue[following-sibling::dialogue/sentence[@tag2="different_attribute2"]]/sentence/text()

Find speaker A given B
//dialogue[following-sibling::dialogue/sentence[@tag2="different_attribute2"]]/@speaker
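
One caveat on the sibling expressions above: following-sibling::dialogue matches every dialogue that appears anywhere before a match, not only the one directly before it, so with more than two dialogues they would return all earlier speakers. If you would rather stay with the standard library's xml.etree.ElementTree (whose XPath subset has no sibling axes), a rough alternative is to walk the dialogues in order and remember the previous one. This is only a sketch: the function name match_with_previous is made up for illustration, the corpus is inlined so the snippet runs on its own, and it assumes every dialogue is a direct child of the root.

```python
import xml.etree.ElementTree as ET

# Sample corpus from the question, inlined so the sketch is self-contained.
XML = """<corpus>
  <dialogue speaker="A">
    <sentence tag1="attribute1" tag2="attribute2"> Hello </sentence>
  </dialogue>
  <dialogue speaker="B">
    <sentence tag1="different_attribute1" tag2="different_attribute2"> How are you </sentence>
  </dialogue>
</corpus>"""


def match_with_previous(root, tag2_value):
    """Hypothetical helper: collect (attrib, text) pairs for each matching
    dialogue, preceded by the pairs of the dialogue directly before it."""
    results = []
    previous = None  # the dialogue seen on the previous iteration, if any
    for dialogue in root.findall("dialogue"):  # direct children, document order
        hits = [s for s in dialogue.findall("sentence")
                if s.get("tag2") == tag2_value]
        if hits:
            if previous is not None:
                for s in previous.findall("sentence"):
                    results.append((dict(previous.attrib), (s.text or "").strip()))
            for s in hits:
                results.append((dict(dialogue.attrib), (s.text or "").strip()))
        previous = dialogue
    return results


root = ET.fromstring(XML)
pairs = match_with_previous(root, "different_attribute2")
for attrib, text in pairs:
    print(attrib)
    print(text)
```

On the sample corpus this prints the speaker A entry first and the speaker B entry second, in the format asked for in the question. Note that if two consecutive dialogues both match, the first one is reported twice (once as a match, once as the predecessor of the next match); deduplicate if that matters for your data.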
