(Translated with DeepL)
Hey, I’ve got a problem.
I have an HTML file hosted on GitHub Pages that looks like this (it can be accessed at this site):
<head>
<meta http-equiv="refresh" content="0; url=https://gi-b716.github.io/modrinth-api-tools/api/files/1.2.0.zip">
</head>
Opening this HTML file should redirect to 1.2.0.zip, and when I open it in Google Chrome it does. However, I want to download the file in Python, so I wrote this code:
import requests

download_url = 'https://gi-b716.github.io/modrinth-api-tools/api/version/d/lasted'
r = requests.get(download_url, allow_redirects=True)
# The with block closes the file on exit, so no explicit f.close() is needed.
with open("1.2.0.zip", "wb") as f:
    f.write(r.content)
But I never get the file. Even though requests detects a redirect, I can’t find any information about the file in the response.
This is what is returned:
This is the redirect it detected:
What can I do to capture the correct file?
>Solution :
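This type of redirect is a meta refresh declared in the HTML itself, not an HTTP redirect sent in the response headers. requests only follows header-based redirects, so it returns the HTML page unchanged. You have to extract the target URL from the page yourself, for example with a regex, and then request that URL. You can do something like this: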
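import re

import requests

def requests_redirects():
    download_url = 'https://gi-b716.github.io/modrinth-api-tools/api/version/d/lasted'
    r = requests.get(download_url, allow_redirects=True)
    # The meta refresh lives in the HTML body, so look for it there.
    if b'content="0; url=' in r.content:
        # Capture everything between 'url=' and the closing quote.
        pattern = re.compile(r'content="0; url=(.*?)"')
        redirect_url = pattern.findall(r.content.decode())[0]
        # Fetch the real file from the URL named in the meta tag.
        r = requests.get(redirect_url, allow_redirects=True)
        # Only write when a redirect was found, so the HTML page is never saved as the zip.
        with open("1.2.0.zip", "wb") as f:
            f.write(r.content)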
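If the regex feels too brittle (it only matches a delay of exactly 0 and that exact attribute spelling), the same extraction could be sketched with the standard library's HTML parser instead. This is only a sketch under some assumptions: the names MetaRefreshParser and find_meta_refresh are made up for illustration, and it only looks at the first meta refresh tag it finds.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Hypothetical helper: records the URL of the first meta refresh tag."""

    def __init__(self):
        super().__init__()
        self.refresh_url = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.refresh_url is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # content looks like "0; url=https://host/path"; split off the delay.
        _, _, rest = attr.get("content", "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url" and value:
            self.refresh_url = value.strip().strip("'\"")


def find_meta_refresh(html, base_url):
    """Return the absolute meta refresh target in html, or None if there is none."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.refresh_url is None:
        return None
    # Resolve relative targets against the URL of the page that was fetched.
    return urljoin(base_url, parser.refresh_url)
```

With the response from the code above, find_meta_refresh(r.text, r.url) should give the URL to pass to a second requests.get call, and a None result means the page had no meta refresh at all.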