JSONDecodeError: Expecting value: line 1 column 1 (char 0) when scraping SEC EDGAR

My code is as follows:

import requests
import urllib
from bs4 import BeautifulSoup

year_url = r"https://www.sec.gov/Archives/edgar/daily-index/2020/index.json"
year_content = requests.get(year_url)
decoded_year_url = year_content.json()

I could run exactly the same code last year, but when I ran it yesterday, the following error was raised:
"JSONDecodeError: Expecting value: line 1 column 1 (char 0)"
Why does this happen, and how can I solve the problem? Thanks a lot!


Solution:

According to this GitHub issue from May 2021, the SEC has apparently added rate limiting to its website. You receive the error because the response body contains HTML rather than JSON, so the call to .json() fails when it tries to parse that body.
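The error message itself can be reproduced without any network access. The following is a minimal sketch using the standard-library json module; the HTML string is a made-up stand-in, not the SEC's actual response body.

```python
import json

# Hypothetical stand-in for an HTML page returned in place of the JSON index.
html_body = "<!DOCTYPE html><html><body>Request blocked</body></html>"

try:
    json.loads(html_body)
except json.JSONDecodeError as exc:
    # The parser fails on the very first character ("<"), hence "char 0".
    message = str(exc)
    print(message)  # Expecting value: line 1 column 1 (char 0)
```

Because the parser gives up at the first character, "line 1 column 1 (char 0)" is a strong hint that the body is not JSON at all, rather than JSON with a small syntax error.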

To resolve this, add a User-Agent header to your request. I can access the JSON with the following:

import requests
import urllib
from bs4 import BeautifulSoup

year_url = r"https://www.sec.gov/Archives/edgar/daily-index/2020/index.json"
# Send a User-Agent header with the request; replace the placeholder with your own value.
year_content = requests.get(year_url, headers={'User-Agent': '[specify user agent here]'})
decoded_year_url = year_content.json()
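
To make this kind of failure easier to diagnose in the future, it may help to inspect the response before decoding it. Below is a minimal sketch, not part of the original answer: parse_json_response is a hypothetical helper, and it assumes the server labels JSON responses with a Content-Type containing "json".

```python
import json


def parse_json_response(status_code, content_type, body):
    """Decode a JSON body, or raise a descriptive error if it is not JSON.

    Hypothetical helper. With requests, the arguments would come from
    response.status_code, response.headers.get("Content-Type", "")
    and response.text.
    """
    preview = body[:80]  # short excerpt so the error shows what was received
    if status_code != 200:
        raise ValueError(f"HTTP {status_code}; body starts with {preview!r}")
    # Assumption: JSON responses carry a Content-Type that mentions "json".
    if "json" not in content_type.lower():
        raise ValueError(
            f"expected JSON but got Content-Type {content_type!r}; "
            f"body starts with {preview!r}"
        )
    return json.loads(body)
```

With the request above, this could be called as parse_json_response(year_content.status_code, year_content.headers.get("Content-Type", ""), year_content.text). If the server sends back an HTML page, the error then shows the status code and the start of the body instead of a bare JSONDecodeError.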