How to read in an xml file in Python without node

I am trying to read this file in Python:

https://www.europarl.europa.eu/meps/en/full-list/xml/a

I have used this code:

from bs4 import BeautifulSoup as bs
import requests
import pandas as pd

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'
}
url = 'https://www.europarl.europa.eu/meps/en/full-list/xml/a'
soup = bs(requests.get(url, headers=headers).text, 'lxml')
df = pd.read_xml(str(soup))
print(df)

But the result looks wrong:

   meps
0   NaN

Can anyone help me please?

Solution:

There is no need for intermediate libraries; pandas.read_xml can read directly from a URL:

df = pd.read_xml('https://www.europarl.europa.eu/meps/en/full-list/xml/a')
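As for why the original attempt printed a single NaN row: the most likely cause is that BeautifulSoup's 'lxml' parser is an HTML parser, so it wraps the document in html and body elements before it is passed on. By default read_xml uses xpath='./*', which turns each child of the root into a row and that child's own children into columns. The sketch below imitates that mapping with the standard library only. It is a simplified illustration, not how pandas is implemented, and the sample data and the element name mep are assumptions rather than a copy of the real file.

```python
import xml.etree.ElementTree as ET


def rows_from_root(xml_text):
    """Roughly mimic read_xml's default xpath='./*':
    one row per child of the root, one column per grandchild."""
    root = ET.fromstring(xml_text)
    return [{field.tag: field.text for field in row} for row in root]


# Hypothetical sample shaped like the MEP list (element name 'mep' is assumed).
sample = (
    "<meps>"
    "<mep><fullName>Asim ADEMOV</fullName><country>Bulgaria</country></mep>"
    "<mep><fullName>Matteo ADINOLFI</fullName><country>Italy</country></mep>"
    "</meps>"
)

# What an HTML parser effectively hands back: the XML wrapped in html/body.
wrapped = "<html><body>" + sample + "</body></html>"

# Parsed directly, every <mep> becomes a row with its fields as columns.
print(rows_from_root(sample))

# Wrapped, the root's only child is <body>, whose only child is <meps>,
# so there is a single row with a single empty 'meps' column.
print(rows_from_root(wrapped))  # [{'meps': None}]
```

If BeautifulSoup has to stay in the pipeline for some other reason, building the soup with the 'xml' parser instead of 'lxml' should avoid the html/body wrapper, though reading the URL directly is simpler.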

If you need to pass custom headers, use storage_options; for HTTP(S) URLs, its key-value pairs are forwarded to urllib.request.Request as header options:

url = 'https://www.europarl.europa.eu/meps/en/full-list/xml/a'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'
}

df = pd.read_xml(url, storage_options=headers)

Output:

              fullName   country                                     politicalGroup      id                         nationalPoliticalGroup
0  Magdalena ADAMOWICZ    Poland  Group of the European People's Party (Christia...  197490                                    Independent
1          Asim ADEMOV  Bulgaria  Group of the European People's Party (Christia...  189525  Citizens for European Development of Bulgaria
2    Isabella ADINOLFI     Italy  Group of the European People's Party (Christia...  124831                                   Forza Italia
3      Matteo ADINOLFI     Italy                       Identity and Democracy Group  197826                                           Lega
4    Alex AGIUS SALIBA     Malta  Group of the Progressive Alliance of Socialist...  197403                               Partit Laburista
...
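If your pandas version does not accept storage_options for read_xml, or you want more control over the request (a timeout, for example), a rough equivalent is to download the document yourself and hand read_xml a file-like object. This is a minimal sketch under those assumptions: build_request and fetch_xml are hypothetical helper names, and the User-Agent value is shortened for illustration.

```python
import io
import urllib.request


def build_request(url, headers):
    """Build a request carrying the given HTTP headers (no network access yet)."""
    return urllib.request.Request(url, headers=headers)


def fetch_xml(url, headers, timeout=30):
    """Download the document and return it as an in-memory binary buffer."""
    with urllib.request.urlopen(build_request(url, headers), timeout=timeout) as resp:
        return io.BytesIO(resp.read())


url = 'https://www.europarl.europa.eu/meps/en/full-list/xml/a'
headers = {'User-Agent': 'Mozilla/5.0'}  # shortened, illustrative value

req = build_request(url, headers)
# urllib stores header names capitalized, hence 'User-agent' here.
print(req.get_header('User-agent'))

# With pandas installed and network access available:
# import pandas as pd
# df = pd.read_xml(fetch_xml(url, headers))
```

read_xml accepts file-like objects as well as paths and URLs, so the buffer returned by fetch_xml can be passed to it directly.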