I want to grab all the name and price keys at once, but the keys of the first dict aren't all in the same order, so I can't iterate over them directly. That's why I have to preprocess the data a little, but I'm getting inconsistent output.
Current output:
Name Price
0 Half-Life: Opposing Force [{'id': 32, 'discount_block': '<div class="dis...
1 Half-Life [{'id': 34, 'discount_block': '<div class="dis...
2 Half-Life 2 [{'id': 36, 'discount_block': '<div class="dis...
3 Half-Life 2: Episode Two [{'id': 516, 'discount_block': '<div class="di...
4 Cuphead [{'id': 35659, 'discount_block': '<div class="...
5 Steam Controller []
6 PCMark 10 [{'id': 125001, 'discount_block': '<div class=...
7 Kerbal Space Program 2 []
8 Hollow Knight: Silksong
Expected Output:
name price
0 Half-Life: Opposing Force 59
1 Half-Life 109
2 Half-Life 2 109
3 Half-Life 2: Episode Two 89
4 Cuphead 573
5 Steam Controller []  # meaning None
6 PCMark 10 157
7 Kerbal Space Program 2 []
8 Hollow Knight: Silksong []
Script:
import re
import json
import requests
import pandas as pd
url = 'https://store.steampowered.com/wishlist/id/zorro4/#sort=order'
wishlist_url = json.loads( re.findall(r'g_strWishlistBaseURL = (".*?");', requests.get(url).text)[0] )
#print(wishlist_url)
data = requests.get(wishlist_url + 'wishlistdata/?p=0').json()
#print(wishlist_url + 'wishlistdata/?p=0')
# jsn_data=json.dumps(data, indent=4)
# with open('da.json','w') as f:
# f.write(jsn_data)
names = [d['name'] for d in data.values()]
# print(names)
out = list(map(lambda x: x['subs'], data.values()))
p = []
for i in out:
    for t in i:
        q = t['price']
        p.append(q)
        #print(q)
df = pd.DataFrame(data=list(zip(names, out)), columns=['Name', 'Price'])
print(df)
>Solution :
Iterate through data. The price information sits inside the subs list, so you can extract it by index position, which gives you an HTML snippet. Then use bs4 to find the specific price; for now I have taken discount_final_price for the output. Append each name and price to lst so you end up with a list of values, then create df using pandas, passing lst as the data and whatever column names you want.
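import numpy as np
import pandas as pd
from bs4 import BeautifulSoup

lst = []
for key, value in data.items():
    name = value['name']
    try:
        # The first entry in 'subs' carries the price markup as an HTML snippet
        price_data = value['subs'][0]['discount_block']
        soup = BeautifulSoup(price_data, "html.parser")
        # Keep the last space-separated token of the final price text
        price = soup.find("div", class_="discount_final_price").get_text().split(" ")[-1]
    except (IndexError, KeyError, AttributeError, TypeError):
        # Empty 'subs' or no final-price div: record the price as missing
        price = np.nan
    lst.append([name, price])

df = pd.DataFrame(lst, columns=['name', 'price'])
print(df)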
Output:
name price
0 Half-Life: Opposing Force 39
1 Half-Life 69
2 Half-Life 2 69
3 Half-Life 2: Episode Two 59
4 Cuphead 395
5 Steam Controller NaN
6 PCMark 10 104
7 Kerbal Space Program 2 NaN
8 Hollow Knight: Silksong NaN
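If installing bs4 is not an option, the same per-entry extraction can be sketched with the standard library's html.parser instead. This is only a rough variation on the loop above, not a replacement for it: FinalPriceParser and extract_price are hypothetical helper names, and the sample dict, its markup and its currency symbol are made up to mimic the shape of the wishlist data. Entries with an empty subs list get None rather than NaN, so each name stays paired with its own price.

```python
from html.parser import HTMLParser


class FinalPriceParser(HTMLParser):
    """Hypothetical helper: grabs the text of the first div.discount_final_price."""

    def __init__(self):
        super().__init__()
        self._inside = False  # True while inside the target div
        self.text = None      # first non-empty text found in that div

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "discount_final_price" in classes:
            self._inside = True

    def handle_data(self, data):
        if self._inside and self.text is None and data.strip():
            self.text = data.strip()

    def handle_endtag(self, tag):
        self._inside = False


def extract_price(entry):
    """Return the price string of the first sub, or None if there is none."""
    subs = entry.get("subs") or []
    if not subs:
        return None
    parser = FinalPriceParser()
    parser.feed(subs[0].get("discount_block") or "")
    if not parser.text:
        return None
    # Same convention as the answer: keep the last space-separated token
    return parser.text.split(" ")[-1]


# Made-up sample with the same shape as the wishlist data (assumption)
sample = {
    "50": {
        "name": "Half-Life: Opposing Force",
        "subs": [{
            "id": 32,
            "discount_block": '<div class="discount_block">'
                              '<div class="discount_final_price">$ 39</div></div>',
        }],
    },
    "353370": {"name": "Steam Controller", "subs": []},
}

rows = [[entry["name"], extract_price(entry)] for entry in sample.values()]
print(rows)
```

The resulting rows list has the same [name, price] layout as lst, so it could be handed to pd.DataFrame with the same column names.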