I’m unable to connect to the URL using the requests module, but it works fine when browsed in a browser. Could it be a robots.txt Allow/Disallow issue?
Below is the code:
import requests
r = requests.get('https://myntra.com')
print(r)
>Solution :
Some websites block access from non-browser ‘User-Agent’ strings to prevent web scraping, including the default ‘User-Agent’ sent by Python’s requests library. So you need to pass a ‘User-Agent’ header that looks like a web browser, for example:
r = requests.get('https://myntra.com/', headers={
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0",
})
The ‘User-Agent’ string contains information about which browser is being used, its version, and the operating system it is running on.
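As a sketch of why the default request is easy for a site to identify, you can inspect the ‘User-Agent’ that requests sends out of the box, and set a browser-like one on a Session so it applies to every request (the exact default string varies by requests version):

```python
import requests

# The default User-Agent requests sends, e.g. "python-requests/2.31.0" --
# this is what sites match on when blocking scrapers.
default_ua = requests.utils.default_user_agent()
print(default_ua)

# A Session carries headers across requests, so you set the
# browser-like User-Agent once instead of on every get() call.
session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:100.0) "
                  "Gecko/20100101 Firefox/100.0",
})
print(session.headers["User-Agent"])
```

Using a Session is optional; passing headers= to each requests.get() call, as above, works just as well for a single request.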