How to identify the correct class for BeautifulSoup?

January 17, 2022

I am trying to learn web scraping, and one problem I keep running into is identifying the correct class names. Is there a particular rule or method to follow for this? For example, in the code below I am trying to get the list of questions from the Stack Overflow page. When I click Inspect on the first question, I can see the class name question-hyperlink, but when I use it in the code below I get empty results. I get the same empty results when I try the div class summary. Kindly guide me on how to fix this and how to avoid it in future cases.

import requests
from bs4 import BeautifulSoup
 
website = 'https://stackoverflow.com/'
r = requests.get(website)

if r.status_code == 200:
    print(f"Connected to {website}")
    soup = BeautifulSoup(r.content, 'html.parser')
    s = soup.find_all(class_name='question-hyperlink')
    print(s)
else:
    print(r)
    
print("Done")

Solution:

The URL you’re using doesn’t have any questions: https://stackoverflow.com shows only a starting page unless you’re logged in.

You need to change the URL to https://stackoverflow.com/questions.
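For future cases, one way to check whether a class you saw in the browser inspector is actually present in the HTML you are parsing is to count the classes in it. The sketch below is only an illustration: class_counts is a hypothetical helper, and the two HTML strings are made-up stand-ins for the landing page and the questions page, not the real markup. With a live page you would pass r.text instead.

```python
from collections import Counter

from bs4 import BeautifulSoup


def class_counts(html):
    """Count how often each CSS class appears in the given HTML (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    counts = Counter()
    # class_=True matches every tag that has a class attribute at all
    for tag in soup.find_all(class_=True):
        counts.update(tag["class"])  # tag["class"] is a list of class names
    return counts


# Made-up stand-ins for the two pages (assumption, not the real markup)
landing_html = '<div class="hero"><a class="button" href="/signup">Sign up</a></div>'
questions_html = (
    '<div class="summary"><a class="question-hyperlink" href="/q/1">First</a></div>'
    '<div class="summary"><a class="question-hyperlink" href="/q/2">Second</a></div>'
)

landing = class_counts(landing_html)
questions = class_counts(questions_html)

print(landing["question-hyperlink"])    # 0: the class is not on this page
print(questions["question-hyperlink"])  # 2: the class is present
```

If the count is zero for the HTML you downloaded, the problem is the page content (wrong URL, or content loaded later by JavaScript), not the selector.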

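Also, you should be using class_=, not class_name=, in find_all(). BeautifulSoup treats any keyword argument it does not recognize as a filter on an HTML attribute of that name, so class_name= looks for an attribute called class_name, which no tag has, and the result is empty.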
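The difference between the two keyword arguments can be seen without any network access. This is a minimal sketch on a made-up one-line HTML snippet (an assumption, not Stack Overflow's real markup):

```python
from bs4 import BeautifulSoup

# Made-up snippet for illustration only
html = '<a class="question-hyperlink" href="/q/1">First question</a>'
soup = BeautifulSoup(html, "html.parser")

# class_name= is read as "tags with an HTML attribute named class_name"
wrong = soup.find_all(class_name="question-hyperlink")

# class_= filters on the CSS class (trailing underscore because class is a Python keyword)
right = soup.find_all(class_="question-hyperlink")

# Equivalent spelling using the attrs dictionary
also_right = soup.find_all(attrs={"class": "question-hyperlink"})

print(len(wrong), len(right), len(also_right))  # 0 1 1
```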

Then it works just fine.

import requests
from bs4 import BeautifulSoup

website = 'https://stackoverflow.com/questions/'

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:95.0) Gecko/20100101 Firefox/95.0",
}
r = requests.get(website, headers=headers)

if r.status_code == 200:
    print(f"Connected to {website}")
    # Parse the page, then collect every <a> tag with the question-hyperlink class
    soup = BeautifulSoup(r.text, 'html.parser')
    links = soup.find_all("a", class_="question-hyperlink")
    print(len(links))

Output: