
How do I efficiently check if data was returned in my GET request?

I am web scraping and need to parse a few thousand GET responses at a time. Some of these requests fail with 429 and/or 403 errors, so I need to check that there is data before parsing the response. I wrote this function:

def check_response(response):
    if not response or not response.content:
        return False
    else:
        soup = BeautifulSoup(response.content, "html.parser")
        if not soup or not soup.find_all(attrs={"class": "stuff"}):
            return False
    
    return True

This works, but it can take quite a while to loop through a few thousand responses. Is there a better way?

>Solution :

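You can use the response.status_code attribute to check the status code of the response. MDN has a full list of HTTP status codes; any code >= 400 is an error (4xx for client errors such as 403 and 429, 5xx for server errors). Checking the status before building the soup also means failed responses are rejected without being parsed. If response is a requests.Response, "not response" is already true for these codes, because its truth value is response.ok, but the explicit comparison makes the intent clear. Try this code: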

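from bs4 import BeautifulSoup

def check_response(response):
    # Reject missing, empty, or failed (4xx/5xx) responses before parsing.
    if not response or not response.content or response.status_code >= 400:
        return False
    else:
        # Parse only responses that passed the cheap checks above.
        soup = BeautifulSoup(response.content, "html.parser")
        # The page must contain at least one element with class "stuff".
        if not soup or not soup.find_all(attrs={"class": "stuff"}):
            return False
        return True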

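Note that I moved return True inside the else block. This is only a readability change: in your version, the function-level return True is still reached whenever the else branch finishes without returning False, so both versions behave the same.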
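If the parse itself is the slow part, one option is to skip building a full tree and stop at the first matching element. The sketch below is not the code from the question or the answer above: it swaps BeautifulSoup for the standard library's html.parser, and the helper names (_Found, _ClassFinder, has_class) and the SimpleNamespace stand-ins for real responses are made up for illustration. It assumes the bodies are UTF-8 compatible, and whether it is actually faster on your pages is something you would need to measure.

```python
from html.parser import HTMLParser
from types import SimpleNamespace


class _Found(Exception):
    """Raised to abort parsing as soon as a matching element is seen."""


class _ClassFinder(HTMLParser):
    """Hypothetical helper: looks for any start tag carrying a given class."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # A class attribute may hold several space-separated names.
            if name == "class" and value and self.wanted in value.split():
                raise _Found


def has_class(html_bytes, wanted="stuff"):
    parser = _ClassFinder(wanted)
    try:
        # Assumption: UTF-8 (or ASCII-compatible) response bodies.
        parser.feed(html_bytes.decode("utf-8", errors="replace"))
    except _Found:
        return True
    return False


def check_response(response):
    # Cheap checks first: no response, error status, or empty body.
    if response is None or response.status_code >= 400 or not response.content:
        return False
    return has_class(response.content)


# Stand-in objects for demonstration only; real code would pass actual responses.
ok = SimpleNamespace(status_code=200, content=b'<div class="stuff other">x</div>')
blocked = SimpleNamespace(status_code=429, content=b'<div class="stuff">x</div>')
no_match = SimpleNamespace(status_code=200, content=b"<p>nothing here</p>")

for resp in (ok, blocked, no_match):
    print(resp.status_code, check_response(resp))
```

Because this only answers "is there data?", you would still parse the responses that pass with BeautifulSoup afterwards. The saving, if any, comes from not building a tree for responses you are going to discard.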
