I am trying to pull all the tables from this link: https://www.baseball-reference.com/awards/awards_2017.shtml
However, I am only getting the first two tables, AL MVP Voting and NL MVP Voting. I’m not getting any of the tables after them: AL/NL Cy Young Voting, AL/NL Rookie of the Year Voting, etc.
Here's the code I am using:
import requests
from bs4 import BeautifulSoup
url = 'https://www.baseball-reference.com/awards/awards_2017.shtml'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html')
soup
table = soup.find_all('table')
table[2]
I tried the code and expected all tables to come up but I am only getting the first 2, and am getting None for the third one and beyond.
>Solution :
The tables are embedded in HTML comments, so find_all('table') never sees them. The simplest way to bring them up is to uncomment them before parsing, for example with .replace('<!--','').replace('-->','')
A more targeted alternative is to use bs4.Comment to pick out the comment nodes and parse only those.
Example
import requests
from bs4 import BeautifulSoup

url = 'https://www.baseball-reference.com/awards/awards_2017.shtml'
# Strip the comment delimiters so the hidden tables become regular markup
html = requests.get(url).text.replace('<!--', '').replace('-->', '')
soup = BeautifulSoup(html, 'html.parser')
table = soup.find_all('table')
table[2]
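
The bs4.Comment alternative mentioned above could look roughly like the sketch below. The helper name find_all_tables and the inline sample markup are made up for illustration, and the sample only assumes the real page wraps its extra tables in comments the same way. The idea is to collect the visible tables, then parse each comment that contains a table as HTML in its own right.

```python
from bs4 import BeautifulSoup, Comment


def find_all_tables(html):
    """Return every <table> in html, including ones wrapped in HTML comments.

    Hypothetical helper, not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")
    tables = list(soup.find_all("table"))
    # Comment is a NavigableString subclass, so select string nodes by type.
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        if "<table" in comment:
            # Parse the comment body as its own HTML fragment.
            inner = BeautifulSoup(str(comment), "html.parser")
            tables.extend(inner.find_all("table"))
    return tables


# Small inline sample standing in for the real page (assumed structure).
sample = """
<table id="visible"><tr><td>1</td></tr></table>
<div><!--
<table id="hidden"><tr><td>2</td></tr></table>
--></div>
"""
tables = find_all_tables(sample)
table_ids = [t.get("id") for t in tables]
```

For the real page, pass requests.get(url).text instead of sample. Note that this sketch returns the visible tables first and the commented-out ones after them, so the order may differ from the order on the page.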