import requests
from bs4 import BeautifulSoup
URL = "https://www.hockey-reference.com/leagues/NHL_2021_games.html"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")
results = soup.find(id="all_games")
table = soup.find('div', attrs = {'id':'div_games'})
print(table.prettify())
>Solution :
Select the table element itself (id "games") rather than the div that wraps it, then print it:
table = soup.find('table', attrs = {'id':'games'})
print(table.prettify())
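If the goal is the row data rather than the raw markup, the selected table can be walked row by row. The following is a minimal sketch, not the asker's code: `SAMPLE_HTML` is a made-up, trimmed-down stand-in for the real page (it assumes a `div` with id `div_games` wrapping a `table` with id `games`), and `table_rows` is a hypothetical helper name. It also guards against `find()` returning `None`, which would otherwise make `table.prettify()` raise an `AttributeError`.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup standing in for the real page.
SAMPLE_HTML = """
<div id="div_games">
  <table id="games">
    <thead>
      <tr><th>Date</th><th>Visitor</th><th>G</th><th>Home</th><th>G</th></tr>
    </thead>
    <tbody>
      <tr><th>2021-01-13</th><td>St. Louis Blues</td><td>4</td><td>Colorado Avalanche</td><td>1</td></tr>
      <tr><th>2021-01-13</th><td>Vancouver Canucks</td><td>5</td><td>Edmonton Oilers</td><td>3</td></tr>
    </tbody>
  </table>
</div>
"""


def table_rows(html, table_id):
    """Return the cell text of every row in the table with the given id."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"id": table_id})
    if table is None:
        # find() returns None when nothing matches, so bail out early.
        return []
    rows = []
    for tr in table.find_all("tr"):
        # Header and date cells are <th>, the rest are <td>.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


rows = table_rows(SAMPLE_HTML, "games")
for row in rows:
    print(row)
```

Selecting by the table's id keeps the loop independent of whatever else sits inside the wrapping div.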
Alternatively, use pandas.read_html() to parse the table straight into a DataFrame. It returns a list of DataFrames, so take the first element:
import pandas as pd
# attrs restricts the match to the table with id="games"
df = pd.read_html('https://www.hockey-reference.com/leagues/NHL_2021_games.html', attrs={'id': 'games'})[0]
df.iloc[:, :5]  # first five columns only
Output:
| Date | Visitor | G | Home | G.1 |
|---|---|---|---|---|
| 2021-01-13 | St. Louis Blues | 4 | Colorado Avalanche | 1 |
| 2021-01-13 | Vancouver Canucks | 5 | Edmonton Oilers | 3 |
| 2021-01-13 | Pittsburgh Penguins | 3 | Philadelphia Flyers | 6 |
| 2021-01-13 | Chicago Blackhawks | 1 | Tampa Bay Lightning | 5 |
| 2021-01-13 | Montreal Canadiens | 4 | Toronto Maple Leafs | 5 |
| … | … | … | … | … |