HTML Div Section with Categories

I hope this is an easy question, but I’ve struggled to find a solution or an explanation. Like others, I’m attempting to pull the full Michelin list for an area, not only the restaurants ranked with stars. The listing and details for each restaurant exist in this div section, but I don’t understand how to select or parse out the data-* attributes. I’m able to write a selector that isolates this div, but I lack the next step to make it useful.

<div class="card__menu-footer d-flex js-match-height-footer">
<div class="card__menu-like box-placeholder js-favorite-restaurant" data-pid="1204329" data-enabled="false"
     data-category="restaurant.result"
     data-cooking-type="Japanese"
     data-country="ee"
     data-guide="Estonia"
     data-language="en"
     data-dtm-chef=""
     data-dtm-city="New York"
     data-dtm-distinction=""
     data-dtm-district="Manhattan"
     data-dtm-id="1204329"
     data-dtm-online-booking="False"
     data-dtm-price="none"
     data-dtm-region="New York State"
     data-restaurant-country="us"
     data-restaurant-name="Joji"
     data-restaurant-selection="USA">
    <img src="/assets/images/icons/love-off-58dca5751a8ad8f50468df25d762b097.svg" class="love-this pl-image" alt=""/>
</div>
</div>

Solution:

You can use the .attrs property to access a tag’s attributes. Here is an example of how to parse the data-* attributes into a dict:

from bs4 import BeautifulSoup

html_doc = """\
<div class="card__menu-footer d-flex js-match-height-footer">
<div class="card__menu-like box-placeholder js-favorite-restaurant" data-pid="1204329" data-enabled="false"
     data-category="restaurant.result"
     data-cooking-type="Japanese"
     data-country="ee"
     data-guide="Estonia"
     data-language="en"
     data-dtm-chef=""
     data-dtm-city="New York"
     data-dtm-distinction=""
     data-dtm-district="Manhattan"
     data-dtm-id="1204329"
     data-dtm-online-booking="False"
     data-dtm-price="none"
     data-dtm-region="New York State"
     data-restaurant-country="us"
     data-restaurant-name="Joji"
     data-restaurant-selection="USA">
    <img src="/assets/images/icons/love-off-58dca5751a8ad8f50468df25d762b097.svg" class="love-this pl-image" alt=""/>
</div>
</div>"""

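# Parse the snippet with Python's built-in HTML parser.
soup = BeautifulSoup(html_doc, "html.parser")

# Select the inner div that carries the data-* attributes.
div = soup.select_one(".js-favorite-restaurant")

# Collect every data-* attribute, dropping the "data-" prefix from the key.
out = {}
for attr, value in div.attrs.items():
    if attr.startswith("data-"):
        key = attr.split("-", maxsplit=1)[-1]
        out[key] = value

print(out)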

Prints (reformatted here for readability):

{
    "pid": "1204329",
    "enabled": "false",
    "category": "restaurant.result",
    "cooking-type": "Japanese",
    "country": "ee",
    "guide": "Estonia",
    "language": "en",
    "dtm-chef": "",
    "dtm-city": "New York",
    "dtm-distinction": "",
    "dtm-district": "Manhattan",
    "dtm-id": "1204329",
    "dtm-online-booking": "False",
    "dtm-price": "none",
    "dtm-region": "New York State",
    "restaurant-country": "us",
    "restaurant-name": "Joji",
    "restaurant-selection": "USA",
}
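
Since the goal is the full list for an area, the same idea can probably be extended to every card on a results page. The following is only a sketch under some assumptions: the helper name data_attrs is made up for this example, the HTML is trimmed down, and the second card ("Example Bistro" and its values) is invented purely for illustration and is not real Michelin data. It also assumes every card carries the js-favorite-restaurant class, as in the snippet from the question.

```python
from bs4 import BeautifulSoup

# Trimmed-down, partly hypothetical markup: the first card reuses values from
# the question, the second card is invented for illustration only.
html_doc = """
<div class="card__menu-like js-favorite-restaurant"
     data-restaurant-name="Joji"
     data-cooking-type="Japanese"
     data-dtm-city="New York"
     data-dtm-distinction=""></div>
<div class="card__menu-like js-favorite-restaurant"
     data-restaurant-name="Example Bistro"
     data-cooking-type="French"
     data-dtm-city="New York"
     data-dtm-distinction=""></div>
"""


def data_attrs(tag):
    """Return the tag's data-* attributes as a dict, without the 'data-' prefix."""
    prefix = "data-"
    return {
        name[len(prefix):]: value
        for name, value in tag.attrs.items()
        if name.startswith(prefix)
    }


soup = BeautifulSoup(html_doc, "html.parser")

# select() returns every matching tag, unlike select_one() which returns the first.
restaurants = [data_attrs(div) for div in soup.select(".js-favorite-restaurant")]

for r in restaurants:
    # An empty distinction string appears to mean the restaurant has no star.
    print(r["restaurant-name"], "|", r["cooking-type"], "|", r["dtm-distinction"] or "no distinction")
```

If you only need one or two values, you can also read them straight from the tag, for example div["data-restaurant-name"], or div.get("data-dtm-chef") when the attribute might be missing.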