I'm trying to log in to the following URL: https://fornecedor2.procon.sp.gov.br/login without using Selenium, using Requests/BeautifulSoup instead.
When I extract the page's HTML I can't find the actual login form. I tried to analyse the request with DevTools but couldn't figure it out. It seems like the body tag is hidden somehow. Right now I'm doing the following:
import requests
from bs4 import BeautifulSoup

s = requests.Session()
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36"}
s.headers.update(headers)
r = s.get('https://fornecedor2.procon.sp.gov.br/login')
soup = BeautifulSoup(r.content, 'html.parser')
input_element = soup.find("//*[@id='mat-input-0']")  # This is the XPath, but it returns None
>Solution :
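BeautifulSoup doesn't support XPath, only CSS selectors and its own API, so soup.find() treats the XPath string as a tag name and returns None. Even with a valid selector the input would probably not be found, because the form appears to be rendered client-side by JavaScript and is not in the HTML that Requests receives.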
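To illustrate the selector point, here is a minimal sketch. The HTML string is made up for the example (it is not the real page, which builds its form with JavaScript); it only shows which lookups BeautifulSoup understands.

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML, for illustration only.
html = '<form><input id="mat-input-0" name="user"></form>'
soup = BeautifulSoup(html, "html.parser")

by_css = soup.select_one("#mat-input-0")        # CSS selector
by_id = soup.find(id="mat-input-0")             # BeautifulSoup's own API
by_xpath = soup.find("//*[@id='mat-input-0']")  # treated as a tag name, so nothing matches

print(by_css, by_id, by_xpath)
```

If you really need XPath, a parser such as lxml supports it, but that still won't help with content that only exists after JavaScript runs.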
But for your question: to log in to a page, you usually POST your credentials to the login URL. In this case you can use this example:
import requests

login_url = 'https://procon-fornecedor-prod.azurewebsites.net/connect/token'
data = {
    "grant_type": "password",
    "scope": "openid profile email offline_access",
    "client_id": "procon-fornecedor-client",
    "username": "xxx",  # <-- change to real username
    "password": "yyy",  # <-- change to real password
}

with requests.Session() as s:
    resp = s.post(login_url, data=data)
    print(resp.json())
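What you do with the response depends on the API. As a rough sketch, assuming the JSON follows the usual OAuth2 token response shape with an "access_token" key and that the site's API accepts it as a Bearer token (both are assumptions, I haven't verified them against this service), you could attach it to a session like this. The bearer_session helper and the token value are hypothetical.

```python
import requests


def bearer_session(token_response: dict) -> requests.Session:
    """Return a Session that sends the access token on every request."""
    # "access_token" is an assumed key, based on the usual OAuth2 response shape.
    token = token_response["access_token"]
    s = requests.Session()
    s.headers.update({"Authorization": f"Bearer {token}"})
    return s


# Stand-in for resp.json(); the value is made up.
example = {"access_token": "abc123", "token_type": "Bearer"}
session = bearer_session(example)
print(session.headers["Authorization"])
```

With a real login you would pass resp.json() instead of the example dict, then use the returned session for the follow-up requests you see in DevTools.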