How do I save an XML file locally from a website using Python and requests module?

Advertisements

I’m using Python for a web scraping project, and I bumped into this URL that downloads a XML file to my PC.

Is there a way I can access the XML file that’s downloaded when you click the link? I’m ok with saving the XML locally if that’s the only way, but I have no idea how to do so.

I’ve tried using the requests module, but I get the byte string when doing so.

import requests

r = requests.get(
    "https://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=465601"
)

print(r.content)

>Solution :

You need to specify the request headers to download from that specific site.

Here is how i did it:

import requests

filename = "file.xml"
url = "https://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=465601"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
    'DNT': '1',
    'Host': 'fnet.bmfbovespa.com.br',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-User': '?1',
    'Sec-GPC': '1',
    'Upgrade-Insecure-Request': '1'
}

r = requests.get("https://fnet.bmfbovespa.com.br/fnet/publico/downloadDocumento?id=465601", headers=headers)

with open(filename, "wb") as file:
    file.write(r.content)

Leave a ReplyCancel reply