Convert list data into dictionary

I have extracted the content of a script from a website. Now I want to transform the received list into a dictionary so that it is easier to search.

The JSON data (json_data) looks like this (excerpt):

    'id': 'dungeons-and-raids',
    'name': 'Dungeons & Raids',
    'regionId': 'US',
    'groups': [{
        'content': {
            'lines': [{
                'icon': 'ability_toughness',
                'name': 'Fortified',
                'url': '/affix=10/fortified'
            }, {
                'icon': 'spell_nature_cyclone',
                'name': 'Storming',
                'url': '/affix=124/storming'
            }, {
                'icon': 'ability_ironmaidens_whirlofblood',
                'name': 'Bursting',
                'url': '/affix=11/bursting'
            }],
            'icons': 'large'
        },
        'id': 'mythicaffix',
        'name': 'Mythic+ Affixes',
    },
   ...

And this is my complete Python 3.11 script:

import re
import json
from urllib.request import Request, urlopen

req = Request(
    "https://www.wowhead.com/today-in-wow", headers={"User-Agent": "Mozilla/5.0"}
)
html_page = urlopen(req).read().decode("utf-8")

json_data = re.search(
    r"TodayInWow\(WH\.ge\('tiw-standalone'\), (.*), true\);", html_page
)
json_data = json.loads(json_data.group(1))

data = {
    (d["id"], d["regionId"]): {dd["id"]: dd for dd in d["groups"]} for d in json_data
}

for affixline, affixp in enumerate(
    data[("dungeons-and-raids", "US")]["mythicaffix"]["content"]["lines"]
):
    affixurl = affixp["url"]
    affixname = affixp["name"]
    affixid = affixline

This gives me the error:

TypeError: 'NoneType' object is not subscriptable

It seems dd["id"] returns None, but I don't know why. What is the correct way to build and use the dictionary in this case?

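Solution:

The error means that dd itself is None, not dd["id"]: some entries in d["groups"] come back from the server as null, and indexing None raises the TypeError. Skip those entries by adding an "if dd" check to the inner comprehension: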

import re
import json
from urllib.request import Request, urlopen

req = Request(
    "https://www.wowhead.com/today-in-wow", headers={"User-Agent": "Mozilla/5.0"}
)
html_page = urlopen(req).read().decode("utf-8")

json_data = re.search(
    r"TodayInWow\(WH\.ge\('tiw-standalone'\), (.*), true\);", html_page
)
json_data = json.loads(json_data.group(1))

data = {
    (d["id"], d["regionId"]): {dd["id"]: dd for dd in d["groups"] if dd} for d in json_data  # <-- add check if group is not null
}

for affixline, affixp in enumerate(data[("dungeons-and-raids", "US")]["mythicaffix"]['content']['lines']):
    affixurl = affixp['url']
    affixname = affixp['name']
    print(affixurl, affixname)

Prints:

/affix=10/fortified Fortified
/affix=124/storming Storming
/affix=11/bursting Bursting
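
To try the fix without a network request, here is a minimal sketch that runs on hand-written data. The `sample` list is hypothetical: it copies the shape of the excerpt in the question and adds a `None` entry to stand in for a null group. The helper name `index_groups` and the `affix_by_name` lookup are also illustrative and not part of the original script.

```python
# Hypothetical sample shaped like the excerpt in the question; the None entry
# stands in for a null group returned by the server.
sample = [
    {
        "id": "dungeons-and-raids",
        "regionId": "US",
        "groups": [
            None,
            {
                "id": "mythicaffix",
                "name": "Mythic+ Affixes",
                "content": {
                    "lines": [
                        {"name": "Fortified", "url": "/affix=10/fortified"},
                        {"name": "Storming", "url": "/affix=124/storming"},
                        {"name": "Bursting", "url": "/affix=11/bursting"},
                    ],
                    "icons": "large",
                },
            },
        ],
    }
]


def index_groups(sections):
    """Map (section id, region id) -> {group id: group}, skipping null groups."""
    return {
        (d["id"], d["regionId"]): {g["id"]: g for g in d["groups"] if g}
        for d in sections
    }


# Without the filter, indexing the None entry raises the TypeError.
try:
    {g["id"]: g for g in sample[0]["groups"]}
except TypeError as exc:
    print("unfiltered:", exc)

data = index_groups(sample)
lines = data[("dungeons-and-raids", "US")]["mythicaffix"]["content"]["lines"]

# Index the affixes by name so a single affix can be looked up directly.
affix_by_name = {line["name"]: line["url"] for line in lines}
print(affix_by_name["Storming"])
```

Keying the affixes by name is one way to get the easier search the question asks for. If names were not unique, a different key (for example the URL) would be needed.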