Downloading PDF files from a PHP server using Python

I am trying to download the PDFs (very rarely, a file may be a Word document instead) hosted on a PHP server. The files appear to be numbered sequentially from 1 to 14000 and can be downloaded using the link http://ppmoe.dot.ca.gov/des/oe/awards/bidsum/dl.php?id=X, where X is a number in the [1, 14000] range. The following code handles X = 200; I plan to loop it over every value in [1, 14000] and save all the files in a specific folder:

import requests

url = "http://ppmoe.dot.ca.gov/des/oe/awards/bidsum/dl.php?id=200"

s = requests.Session()
response = s.get(url)

with open("file200.pdf", "w") as f:
    f.write(response.content)
    f.close()

But it’s returning the following error:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: write() argument must be str, not bytes

I’m unsure whether these files can be downloaded with Python, and PHP is unfamiliar to me. Thanks!

Solution:

You need to add b to the mode argument of open() ("wb" instead of "w") so that the data is written to the file as binary: response.content contains bytes, not a string.

with open("file200.pdf", "wb") as f:
    # The with block closes the file on exit, so no explicit f.close() is needed.
    f.write(response.content)
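
To extend the fix to the whole id range mentioned in the question, a loop along these lines might work. This is a minimal sketch rather than a tested scraper: `guess_extension` and `download_range` are hypothetical helper names, the `downloads` folder and the 30-second timeout are arbitrary choices, and the extension is guessed from the leading bytes on the assumption that the server returns the raw document. The `session` argument is assumed to be a `requests.Session` (or any object with a compatible `get` method).

```python
from pathlib import Path

# URL taken from the question; the id is passed as a query parameter.
BASE_URL = "http://ppmoe.dot.ca.gov/des/oe/awards/bidsum/dl.php"


def guess_extension(data: bytes) -> str:
    """Guess a file extension from the leading bytes (hypothetical helper)."""
    if data.startswith(b"%PDF"):
        return ".pdf"
    if data.startswith(b"PK\x03\x04"):  # ZIP container, used by .docx
        return ".docx"
    if data.startswith(b"\xd0\xcf\x11\xe0"):  # OLE2 container, used by legacy .doc
        return ".doc"
    return ".bin"  # unknown type: keep the bytes, flag it for inspection


def download_range(session, first_id, last_id, out_dir="downloads"):
    """Fetch every id in [first_id, last_id] and save each body as bytes."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    saved = []
    for file_id in range(first_id, last_id + 1):
        response = session.get(BASE_URL, params={"id": file_id}, timeout=30)
        if not response.ok or not response.content:
            continue  # skip ids that fail or return an empty body
        path = folder / f"file{file_id}{guess_extension(response.content)}"
        path.write_bytes(response.content)  # binary write, equivalent to mode "wb"
        saved.append(path)
    return saved
```

With the requests library installed, calling `download_range(requests.Session(), 1, 14000)` would attempt every id in turn. Adding a short pause between requests (for example with `time.sleep`) is probably a good idea so the server is not flooded.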