I am reading from a deeply nested JSON file. I have written a class that extracts specific keys from the JSON and builds a DataFrame from them.
When the keys are present everything works as expected, but when a key is missing I get the following error:
TypeError: 'NoneType' object is not subscriptable
Below is a sample JSON and a snippet of the code:
output = {"records": [
    {
        "ID": {
            "$": "123456"
        },
        "Records": {
            "ChildRecord": {
                "Address": [
                    {
                        "$": "24 Street"
                    }
                ]
            }
        }
    },
    {
        "ID": {
            "$": "456789"
        },
        "Records": {
            "ChildRecord": {
                "Address": [
                    {
                        "$": "57 New Town"
                    }
                ]
            }
        }
    }
]}
identifier = [o.get('ID').get('$') for o in output['records']]
address = [o.get('Records').get('ChildRecord').get('Address')[0].get('$') for o in output['records']]
print(identifier,'identifier')
print(address,'address')
Since the keys are present, I get the output below:
identifier ['123456','456789']
address ['24 Street','57 New Town']
If the "Address" key is missing from one of the records, I get TypeError: 'NoneType' object is not subscriptable.
How can I avoid this error and get the following output instead?
identifier ['123456','456789']
address ['24 Street',None]
Also, in some cases the address is split across multiple entries:
"Address":[
{
"$": "57 New Town"
},
{
"$": "New Castle Road"
},
{
"$": "PO Box 309"
}
]
Is there a way to read the entire address as a single string, like this?
['24 Street','57 New Town, New Castle Road, PO Box 309']
>Solution :
The issue is in this part: .get('Address')[0]. When Address is not present, .get() returns None, and indexing None with [0] raises the exception.
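To make the failure concrete, here is a minimal reproduction. The `record` dict is a made-up example (not taken from the question's data) in which "Address" is missing:

```python
# Hypothetical record with no "Address" key under "ChildRecord".
record = {"ID": {"$": "456789"}, "Records": {"ChildRecord": {}}}

# .get() returns None for a missing key when no default is given.
addresses = record.get("Records").get("ChildRecord").get("Address")
print(addresses)  # None

# Indexing None is what raises the error from the question.
try:
    addresses[0]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Passing a default such as `{}` or `[]` as the second argument to .get() avoids the None, so the chain can continue safely.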
Here is a slightly modified version that returns None if Address is not present (instead of raising an error) and joins multiple addresses if they are present:
output = {
    "records": [
        {
            "ID": {"$": "123456"},
            "Records": {"ChildRecord": {"Address": [{"$": "24 Street"}]}},
        },
        {
            "ID": {"$": "456789"},
            "Records": {"ChildRecord": {}},
        },
    ]
}
identifier = [o.get("ID").get("$") for o in output["records"]]
address = [
    ", ".join(
        d["$"] for d in o.get("Records", {}).get("ChildRecord", {}).get("Address", [])
    )
    or None
    for o in output["records"]
]
print(identifier, "identifier")
print(address, "address")
Prints:
['123456', '456789'] identifier
['24 Street', None] address
If output is instead:
output = {
    "records": [
        {
            "ID": {"$": "123456"},
            "Records": {"ChildRecord": {"Address": [{"$": "24 Street"}]}},
        },
        {
            "ID": {"$": "456789"},
            "Records": {
                "ChildRecord": {
                    "Address": [{"$": "57 New Town"}, {"$": "New Castle Road"}]
                }
            },
        },
    ]
}
Then the code prints:
['123456', '456789'] identifier
['24 Street', '57 New Town, New Castle Road'] address
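One case the comprehension does not cover is a key that is present but set to null in the JSON (None in Python): the default passed to .get() is only used when the key is missing, not when its value is None. A possible sketch that handles both cases is below. The helper name `join_addresses`, its `sep` parameter, and the sample `records` are assumptions for illustration, and it assumes every "$" value is a string:

```python
def join_addresses(record, sep=", "):
    """Return the record's address parts joined by sep, or None if there are none."""
    # "or {}" / "or []" also covers keys that are present but set to None,
    # which the default argument of .get() does not handle.
    child = (record.get("Records") or {}).get("ChildRecord") or {}
    parts = [d.get("$") for d in child.get("Address") or []]
    # Skip entries without a usable "$" value.
    joined = sep.join(p for p in parts if p)
    return joined or None


# Hypothetical records covering the present, null, and missing cases.
records = [
    {"Records": {"ChildRecord": {"Address": [{"$": "57 New Town"}, {"$": "PO Box 309"}]}}},
    {"Records": {"ChildRecord": {"Address": None}}},  # explicit null
    {"Records": None},                                 # parent is null
    {},                                                # everything missing
]
address = [join_addresses(r) for r in records]
print(address)  # ['57 New Town, PO Box 309', None, None, None]
```

Moving the lookup into a function also keeps the list comprehension short if more fields need the same treatment.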