Beautiful Soup Scraping

I am trying to scrape lineups from https://www.rotowire.com/hockey/nhl-lineups.php

I would like the resulting dataframe to look like the following:

| Team | Position | Player        | Line          |
|------|----------|---------------|---------------|
| CAR  | C        | Sebastian Aho | Power Play #1 |
| CAR  | LW       | Stefan Noesen | Power Play #1 |

….

This is what I have so far, but I am unsure how to match the team and line to each player and position, and how to put the result into a dataframe:

import requests, pandas as pd
from bs4 import BeautifulSoup

url = "https://www.rotowire.com/hockey/nhl-lineups.php"
soup = BeautifulSoup(requests.get(url).text, "html.parser")

# Container holding all of the lineups
lineups = soup.find_all('div', {'class':['lineups']})[0]

# Player names come from the title attribute of each link
names = lineups.find_all('a', title=True)
for name in names:
    name = name.get('title')
    print(name)

# Positions are printed separately, so they are not tied to a player, team or line
positions = lineups.find_all('div',  {'class':['lineup__pos']})
for pos in positions:
    pos = pos.text
    print(pos)

Solution:

Try:

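import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://www.rotowire.com/hockey/nhl-lineups.php"

soup = BeautifulSoup(requests.get(url).content, "html.parser")

all_data = []
# Each player is a link inside an element with class "lineup__player"
for a in soup.select(".lineup__player a"):
    name = a["title"]
    # Walk backwards from the link: the nearest preceding <div> holds the
    # position, the nearest preceding "lineup__title" holds the line name
    pos = a.find_previous("div").text
    line = a.find_previous(class_="lineup__title").text

    # The last class of the preceding "lineup__list" is reused to pick the
    # matching "lineup__team" element; its <img> alt text is the team name
    lineup = a.find_previous(class_="lineup__list")["class"][-1]
    team = a.find_previous(class_=f"lineup__team {lineup}").img["alt"]

    all_data.append((team, pos, name, line))

df = pd.DataFrame(all_data, columns=["Team", "Pos", "Player", "Line"])
# DataFrame.to_markdown() needs the optional "tabulate" package installed
print(df.to_markdown(index=False))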

Prints:

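| Team   | Pos   | Player            | Line          |
|:-------|:------|:------------------|:--------------|
| CAR    | C     | Sebastian Aho     | POWER PLAY #1 |
| CAR    | LW    | Stefan Noesen     | POWER PLAY #1 |
| CAR    | RW    | Andrei Svechnikov | POWER PLAY #1 |
| CAR    | LD    | Brent Burns       | POWER PLAY #1 |
| CAR    | RD    | Martin Necas      | POWER PLAY #1 |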
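
To see why the backward lookups pair each player with the right team and line, here is a minimal offline sketch of the same find_previous pattern. The HTML below is a hypothetical, simplified stand-in and not the real rotowire.com markup: the "is-visit" and "is-home" class names, the "XYZ" team and "Example Player" are assumptions made up for illustration. It also shows one possible way to get the title-cased "Power Play #1" form asked for in the question, using str.title().

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup (an assumption, NOT the real page).
# Only the class names used in the answer are taken from the source.
HTML = """
<div class="lineup">
  <div class="lineup__team is-visit"><img alt="CAR"></div>
  <div class="lineup__team is-home"><img alt="XYZ"></div>
  <ul class="lineup__list is-visit">
    <li class="lineup__title">POWER PLAY #1</li>
    <li class="lineup__player"><div class="lineup__pos">C</div><a title="Sebastian Aho">S. Aho</a></li>
    <li class="lineup__player"><div class="lineup__pos">LW</div><a title="Stefan Noesen">S. Noesen</a></li>
  </ul>
  <ul class="lineup__list is-home">
    <li class="lineup__title">POWER PLAY #1</li>
    <li class="lineup__player"><div class="lineup__pos">C</div><a title="Example Player">E. Player</a></li>
  </ul>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

rows = []
for a in soup.select(".lineup__player a"):
    # find_previous() searches backwards in document order, which
    # includes earlier siblings and enclosing (ancestor) tags.
    pos = a.find_previous("div").get_text(strip=True)
    line = a.find_previous(class_="lineup__title").get_text(strip=True)

    # Last class of the enclosing list ("is-visit" or "is-home" here).
    side = a.find_previous(class_="lineup__list")["class"][-1]
    # A class_ string containing a space is matched against the whole
    # class attribute, so this skips the other team's block.
    team = a.find_previous(class_=f"lineup__team {side}").img["alt"]

    rows.append((team, pos, a["title"], line))

# Optional: title-case the line name to match the format in the question.
rows_title = [(team, pos, name, line.title()) for team, pos, name, line in rows]

for row in rows_title:
    print(row)
```

With a dataframe like the one in the answer, the same casing change could be applied to the whole column with df["Line"].str.title().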