xml.parsers.expat.ExpatError: not well-formed (invalid token) in Python 3

I have this Python 3 code for E2 (Dreambox):

from xml.dom import Node, minidom
from urllib.request import urlopen, Request
selectedserverurl = 'http://fairbird.liveblog365.com/TSpanel/TSipanel.xml'

def downloadxmlpage():
        req = Request(selectedserverurl)
        response = urlopen(req)
        data = response.read()
        response.close()
        print("data:",data)
        gotPageLoad(data)
        print("gotPageLoad(data):", gotPageLoad(data))

def gotPageLoad(data = None):
        if data != None:
            xmlparse = minidom.parseString(data)
            for plugins in xmlparse.getElementsByTagName('plugins'):
                item = plugins.getAttribute('cont')
                if 'TSpanel' in item:
                    for plugin in plugins.getElementsByTagName('plugin'):
                        tsitem = plugin.getAttribute('name')
                        print("tsitem:", tsitem)

downloadxmlpage()

I am trying to read this file and extract its content:
http://fairbird.liveblog365.com/TSpanel/TSipanel.xml

But I get this error:

data: b'<html><body><script type="text/javascript" src="/aes.js" ></script><script>function toNumbers(d){var e=[];d.replace(/(..)/g,function(d){e.push(parseInt(d,16))});return e}function toHex(){for(var d=[],d=1==arguments.length&&arguments[0].constructor==Array?arguments[0]:arguments,e="",f=0;f<d.length;f++)e+=(16>d[f]?"0":"")+d[f].toString(16);return e.toLowerCase()}var a=toNumbers("f655ba9d09a112d4968c63579db590b4"),b=toNumbers("98344c2eee86c3994890592585b49f80"),c=toNumbers("55cc7e99e3f798b6063f25e8b0f8aa76");document.cookie="__test="+toHex(slowAES.decrypt(c,2,a,b))+"; expires=Thu, 31-Dec-37 23:55:55 GMT; path=/"; location.href="http://fairbird.liveblog365.com/TSpanel/TSipanel.xml?i=1";</script><noscript>This site requires Javascript to work, please enable Javascript in your browser or use a browser with Javascript support</noscript></body></html>'
Traceback (most recent call last):
  File "/home/raed/Desktop/test.py", line 24, in <module>
    downloadxmlpage()
  File "/home/raed/Desktop/test.py", line 11, in downloadxmlpage
    gotPageLoad(data)
  File "/home/raed/Desktop/test.py", line 16, in gotPageLoad
    xmlparse = minidom.parseString(data)
  File "/usr/lib/python3.10/xml/dom/minidom.py", line 2000, in parseString
    return expatbuilder.parseString(string)
  File "/usr/lib/python3.10/xml/dom/expatbuilder.py", line 925, in parseString
    return builder.parseString(string)
  File "/usr/lib/python3.10/xml/dom/expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 222

How can I solve this issue?

Solution:

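The data you downloaded is HTML, not XML, so the XML parser fails on it.

The HTML page sets a __test cookie with JavaScript and then redirects to http://fairbird.liveblog365.com/TSpanel/TSipanel.xml?i=1. urlopen does not execute JavaScript, so your script never follows that redirect and only receives the placeholder page, which includes the message "This site requires Javascript to work".

Hosts typically do this to stop automated clients from scraping pages or server files.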
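
To make this failure easier to diagnose, the script could check what it received before handing it to minidom. The following is only a minimal sketch: the helper names looks_like_html and plugin_names are hypothetical, the HTML check is a rough heuristic, and the sample XML in the usage note is an assumed layout inferred from the attributes used in the question ('cont' and 'name'), not the real contents of TSipanel.xml. It does not get past the JavaScript check; it only reports the problem clearly.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def looks_like_html(data: bytes) -> bool:
    """Rough heuristic: True if the payload starts like an HTML page."""
    head = data.lstrip()[:100].lower()
    return head.startswith(b"<html") or head.startswith(b"<!doctype html")


def plugin_names(data: bytes) -> list:
    """Return the 'name' of every <plugin> inside a TSpanel <plugins> group.

    Raises ValueError with a readable message when the payload is an HTML
    page or is not well-formed XML.
    """
    if looks_like_html(data):
        raise ValueError("server returned an HTML page instead of XML")
    try:
        doc = minidom.parseString(data)
    except ExpatError as exc:
        raise ValueError(f"response is not well-formed XML: {exc}") from exc

    names = []
    for plugins in doc.getElementsByTagName("plugins"):
        # Same filter as in the question: only groups whose 'cont' mentions TSpanel.
        if "TSpanel" in plugins.getAttribute("cont"):
            for plugin in plugins.getElementsByTagName("plugin"):
                names.append(plugin.getAttribute("name"))
    return names
```

With an assumed document such as b'<root><plugins cont="TSpanel"><plugin name="a"/></plugins></root>', plugin_names returns the plugin names. With the HTML page from the question it raises ValueError instead of the less obvious ExpatError.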
