xml.etree.ElementTree in Python (3.x) is not parsing one particular attribute value

February 7, 2024

The data file, testfile.xml, looks like this:

<?xml version="1.0" encoding="utf-8"?>
<body>
  <body.head>
    <hedline>
      <hl1 style="header">All the things we lost that summer</hl1>
      <hl2 style="standfirst">It was the promise of seals that sold Virginia on this mission.</hl2>
      <hl2 style="dropcap-large"><em class="dropcap">W</em>e are always calling each other names.</hl2>
    </hedline>
  </body.head>
</body>

This is the script that parses the file:

import xml.etree.ElementTree as ET
tree = ET.parse('testfile.xml')
root = tree.getroot()
if root.find('body.head') is not None:
    if root.find('body.head').find('hedline') is not None:
        for child1 in root.find('body.head').find('hedline'):
            print("Tag    level 1:" + child1.tag)
            print("Attrib level 1:" + str(child1.attrib))
            print("Text   level 1:" + str(child1.text) + "\n")
            for child2 in child1:
                print("Tag    level 2:" + child2.tag)
                print("Attrib level 2:" + str(child2.attrib))
                print("Text   level 2:" + str(child2.text))

And this is the result:

Tag    level 1:hl1
Attrib level 1:{'style': 'header'}
Text   level 1:All the things we lost that summer

Tag    level 1:hl2
Attrib level 1:{'style': 'standfirst'}
Text   level 1:It was the promise of seals that sold Virginia on this mission.

Tag    level 1:hl2
Attrib level 1:{'style': 'dropcap-large'}
Text   level 1:None  <-- THIS IS THE PROBLEM

Tag    level 2:em
Attrib level 2:{'class': 'dropcap'}
Text   level 2:W

For the third element (the hl2 with style="dropcap-large"), I expected the "Text   level 1:" line to show "e are always calling each other names." from the data file, but it prints None instead.
How can I get at that text?
This is Python 3.12 on Windows.

Thanks, Martijn
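
A possible explanation, offered as a sketch rather than a definitive answer: ElementTree does not lose that text, it stores it somewhere else. An element's text attribute holds only the text between its start tag and its first child. Text that follows a child's end tag is kept in that child's tail attribute. In the third hl2 there is nothing before <em>, so hl2.text is None, and "e are always calling each other names." ends up in em.tail. The example below uses a shortened inline copy of the data (an assumption, to keep it self-contained) and the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened inline copy of the problem element from testfile.xml
# (assumption: the real script would use ET.parse('testfile.xml') instead).
xml_data = """<hedline>
  <hl2 style="dropcap-large"><em class="dropcap">W</em>e are always calling each other names.</hl2>
</hedline>"""

root = ET.fromstring(xml_data)
hl2 = root.find('hl2')
em = hl2.find('em')

# Nothing sits between <hl2 ...> and <em ...>, so text is None.
print("hl2.text:", hl2.text)

# The text inside <em> itself.
print("em.text: ", em.text)

# The text after </em>, up to </hl2>, belongs to the child's tail.
print("em.tail: ", em.tail)

# itertext() walks text and tail values in document order,
# so joining it gives the whole sentence.
full_text = "".join(hl2.itertext())
print("full:    ", full_text)
```

In the original script, printing str(child2.tail) inside the inner loop would show the missing text, and "".join(child1.itertext()) would give the complete sentence for each headline element, with or without nested tags.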