Problem with BeautifulSoup when retrieving image src attribute and comparing it

I want to retrieve the list of games played from the FIDE archives (e.g. https://ratings.fide.com/view_source.phtml?code=272077). I can get all the columns without trouble, but to know whether the player was white or black I also need the image on the same line as each game (<img align="absbottom" border="0" src="/imga/clr_bl.gif"/> for black and <img align="absbottom" border="0" src="/imga/clr_wh.gif"/> for white). The problem is that when I try to set the player variable to 1 for white and 0 for black, neither of my two if conditions matches (I also tried using in, which doesn't work either). Here is the code:

for row in rows:
    picture = row.find_all('img')
    print("loop")
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    cols.append(picture)
    data.append([ele for ele in cols if ele])
for row in data:
    if "ID" and "Name" in row:
        continue
    if 'Round' and 'Opp. name' in row:
        continue
    if "Game" in row:
        tempdata = {
            "ID" : row[0],
            "Fed" : row[2],
            "Rating" : row[3]
        }
        continue
    if '<img align="absbottom" border="0" src="/imga/clr_wh.gif"/>' == picture[0]:
        player = 1
        print("player is white")
    elif '<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>' == picture[0]:
        player = 0
        print("player is black")
    else:
        player = 1
        print(picture)
        print("an error occured")
    print(player)  

I get

[<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>]
an error occured
1
[<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>]
an error occured
1
[<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>]
an error occured
1
[<img align="absbottom" border="0" src="/imga/clr_bl.gif"/>]
an error occured

and so on.

Solution:

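picture[0] is a BeautifulSoup Tag object, not a string, so comparing it to an HTML string with == never matches, even though it prints the same text. Test its src attribute instead: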

...

if "_wh" in picture[0]["src"]:
    player = 1
    print("player is white")
elif "_bl" in picture[0]["src"]:
    player = 0
    print("player is black")

...
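To see why the string comparison fails, here is a minimal offline sketch. The two-row table and the player names in it are made up for illustration; only the img markup is taken from the question. It also reads the image inside the loop for each row, because in the question's second loop picture still holds whatever the last row of the first loop left in it.

```python
from bs4 import BeautifulSoup

# Hypothetical two-row table; only the <img> markup comes from the question.
html = """
<table>
  <tr><td><img align="absbottom" border="0" src="/imga/clr_wh.gif"/> <a href="#">Player A</a></td></tr>
  <tr><td><img align="absbottom" border="0" src="/imga/clr_bl.gif"/> <a href="#">Player B</a></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
results = []

for row in soup.find_all("tr"):
    img = row.find("img")  # a Tag object, or None if the row has no image
    if img is None:
        continue

    # A Tag does not compare equal to a plain string, even its own str() form.
    equals_string = (img == str(img))

    # The attribute value is a real string, so substring tests work on it.
    src = img.get("src", "")
    player = 1 if "_wh" in src else 0

    results.append((equals_string, src, player))
    print(equals_string, src, player)
```

Using img.get("src", "") rather than img["src"] is a small defensive choice here: it avoids a KeyError if an image happens to have no src attribute.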

EDIT: Example to get player colors:

import requests
from bs4 import BeautifulSoup

url = "https://ratings.fide.com/view_source.phtml?code=272077"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

for row in soup.select("tr:has(img[align]):not(:has(table))"):
    color = "white" if "_wh" in row.img["src"] else "black"
    name = row.select_one("img + a")
    print("{:<10} {}".format(color, name.text if name else "N/A"))

Prints:

white      Tas, Ruzgar
black      Kartop, Metehan
white      Haznedar, Galip
black      Ugur, Cem
white      Yuzsever, Cenk
black      Kiziltas, Inanc Vefa
white      Yurtoglu, Osman Talha
black      Ozcan, Kadir Kutay
white      Acar, Cengiz Can
black      Yardimci, Ramazan

...
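A separate issue in the question's second loop is the header check written as if "ID" and "Name" in row. Python parses this as "ID" and ("Name" in row), and a non-empty string literal is always truthy, so only "Name" is actually tested. The sketch below uses made-up rows and two hypothetical helper functions to show the difference:

```python
# Hypothetical rows, shaped like the lists built in the question's first loop.
header_row = ["ID", "Name", "Fed", "Rating"]
game_row = ["1", "Name", "TUR", "1500"]  # contains "Name" but not "ID"


def is_header_buggy(row):
    # Parsed as: "ID" and ("Name" in row). "ID" is always truthy,
    # so the result depends only on whether "Name" is in the row.
    return "ID" and "Name" in row


def is_header_fixed(row):
    # Test each label for membership explicitly.
    return "ID" in row and "Name" in row


print(is_header_buggy(game_row))  # True: the row is wrongly treated as a header
print(is_header_fixed(game_row))  # False
print(is_header_fixed(header_row))  # True
```

The same applies to the 'Round' and 'Opp. name' check in the question.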