
How to get consistent output from inconsistent data

I want to grab all the name and price values at once, but the top-level dict keys don't follow a consistent order, so I can't iterate over them directly. That's why I preprocess the data a little, but I'm getting inconsistent output.

Current output:

                        Name                                              Price
0  Half-Life: Opposing Force  [{'id': 32, 'discount_block': '<div class="dis...
1                  Half-Life  [{'id': 34, 'discount_block': '<div class="dis...
2                Half-Life 2  [{'id': 36, 'discount_block': '<div class="dis...
3   Half-Life 2: Episode Two  [{'id': 516, 'discount_block': '<div class="di...
4                    Cuphead  [{'id': 35659, 'discount_block': '<div class="...
5           Steam Controller                                                 []
6                  PCMark 10  [{'id': 125001, 'discount_block': '<div class=...
7     Kerbal Space Program 2                                                 []
8    Hollow Knight: Silksong 

Expected Output:

                        name price
0  Half-Life: Opposing Force    59
1                  Half-Life   109
2                Half-Life 2   109
3   Half-Life 2: Episode Two    89
4                    Cuphead   573
5           Steam Controller    []  # meaning None
6                  PCMark 10   157
7     Kerbal Space Program 2    []
8    Hollow Knight: Silksong    []

Script:

import re
import json
import requests
import pandas as pd

url = 'https://store.steampowered.com/wishlist/id/zorro4/#sort=order'
wishlist_url =  json.loads( re.findall(r'g_strWishlistBaseURL = (".*?");', requests.get(url).text)[0] )
#print(wishlist_url)

data = requests.get(wishlist_url + 'wishlistdata/?p=0').json()

#print(wishlist_url + 'wishlistdata/?p=0')

# jsn_data=json.dumps(data, indent=4)
# with open('da.json','w') as f:
#     f.write(jsn_data)
names = [d['name'] for d in data.values()]
# print(names)

out = list(map(lambda x: x['subs'], data.values()))
p=[]
for i in out:
    for t in i:
        q=t['price']
        p.append(q)
        #print(q)

df = pd.DataFrame(data=list(zip(names, out)), columns=['Name', 'Price'])
print(df)

Solution:

Iterate through data. Each entry's subs value is a list, so take its
first element by index position; its discount_block field holds an
HTML fragment. Use bs4 (BeautifulSoup) to find the specific price you
want in that fragment; here I have taken discount_final_price.

Append each name and price to lst so you end up with a list of rows,
then create the DataFrame with pandas, passing lst as the data and
whatever column names you want.

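import numpy as np
import pandas as pd
from bs4 import BeautifulSoup

lst = []
for key, value in data.items():
    name = value['name']
    try:
        # The first entry in 'subs' carries the price markup as an HTML fragment
        price_data = value['subs'][0]['discount_block']
        soup = BeautifulSoup(price_data, "html.parser")
        # Keep the last space-separated token of the final price text
        price = soup.find("div", class_="discount_final_price").get_text().split(" ")[-1]
    except (KeyError, IndexError, AttributeError, TypeError):
        # No subs, or no final-price div: record a missing value
        price = np.nan
    lst.append([name, price])

df = pd.DataFrame(lst, columns=['name', 'price'])
print(df)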

Output:

                name          price
0   Half-Life: Opposing Force   39
1   Half-Life                   69
2   Half-Life 2                 69
3   Half-Life 2: Episode Two    59
4   Cuphead                     395
5   Steam Controller            NaN
6   PCMark 10                   104
7   Kerbal Space Program 2      NaN
8   Hollow Knight: Silksong     NaN
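
As a side note, the loop in the question reads t['price'] from each subs entry, which suggests the JSON may carry a price field directly. That loop loses alignment because an empty subs list contributes nothing, so the price list ends up shorter than the name list. The following is only a minimal sketch of keeping one price per name without parsing HTML. The helper first_sub_price and the sample data are hypothetical, and the type and unit of price are assumptions, since the source does not show them.

```python
def first_sub_price(entry):
    """Return the 'price' of the first sub, or None when there is none."""
    # 'subs' may be missing or empty for items without a purchase option
    subs = entry.get("subs") or []
    if not subs:
        return None
    return subs[0].get("price")


# Hypothetical sample shaped like the wishlist JSON (names and values are made up)
sample = {
    "50": {"name": "Game A", "subs": [{"id": 1, "price": 100}]},
    "70": {"name": "Game B", "subs": []},
    "90": {"name": "Game C"},
}

# One (name, price) pair per entry, so the two columns always stay aligned
rows = [(entry["name"], first_sub_price(entry)) for entry in sample.values()]
print(rows)
```

The rows list can be passed to pd.DataFrame(rows, columns=['name', 'price']) in the same way as lst above.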