I’m having trouble with some basic pandas functionality. I read a DataFrame from a JSON file, and when I do
df.iloc[0]['someprop']['someotherprop']
it all works. When I do
df.iloc[0:1]['someprop']['someotherprop']
I expect it to work the same as before, but instead I get KeyError: 'someotherprop'.
Interestingly, df.iloc[0:1]['someprop'] works, which is quite confusing.
Can someone please tell me how to access the second property? I want to do:
df['someprop']['someotherprop'].unique()
at the end.
>Solution :
The issue is that df.iloc[0] returns a Series, whereas df.iloc[0:1] returns a DataFrame:
type(df.iloc[0])
# <class 'pandas.core.series.Series'>
type(df.iloc[0:1])
# <class 'pandas.core.frame.DataFrame'>
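Both can then be indexed by someprop: the first returns a single value (presumably a dict in your case), while the second returns a Series. Indexing that Series with 'someotherprop' looks for a row label with that name rather than a dict key, hence the KeyError. So to access the someotherprop value in the second case, you first need to index into the Series, i.e.: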
df.iloc[0:1]['someprop'][0]['someotherprop']
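To make the difference concrete, here is a minimal sketch that reproduces the behaviour. The DataFrame contents are made up to stand in for your JSON file (assuming each someprop cell holds a dict), and it uses .iloc[0] on the Series, which is positional and so does not depend on the row happening to be labelled 0:

```python
import pandas as pd

# Hypothetical data standing in for the JSON file from the question.
df = pd.DataFrame({
    "someprop": [
        {"someotherprop": "a"},
        {"someotherprop": "b"},
        {"someotherprop": "a"},
    ]
})

row = df.iloc[0]    # Series: a single row, indexed by column name
sub = df.iloc[0:1]  # DataFrame: a one-row slice, still two-dimensional

# Scalar case: row["someprop"] is the dict itself, so a dict lookup works.
first = row["someprop"]["someotherprop"]

# Slice case: sub["someprop"] is a Series of dicts, indexed by row label.
col = sub["someprop"]
try:
    col["someotherprop"]  # searched for among the row labels, not the dict keys
    raised = False
except KeyError:
    raised = True

# Select the element by position first, then do the dict lookup.
second = col.iloc[0]["someotherprop"]
```

Here raised ends up True, and both first and second hold the value stored in the first row's dict.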
To get all the unique values, you’d need to iterate over them, for example:
unq = set(d['someotherprop'] for d in df.iloc[0:1]['someprop'])
Or, for the whole DataFrame:
unq = set(d['someotherprop'] for d in df['someprop'])
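If you would rather end with a .unique() call, as in the question, one possible approach is to pull the nested value out into its own Series first. The sketch below shows two ways to do that, again on made-up data and assuming every someprop cell is a dict containing the someotherprop key: Series.map with a small lambda, and pd.json_normalize to flatten the dicts into columns.

```python
import pandas as pd

# Hypothetical data standing in for the JSON file from the question.
df = pd.DataFrame({
    "someprop": [
        {"someotherprop": "a"},
        {"someotherprop": "b"},
        {"someotherprop": "a"},
    ]
})

# Option 1: map each dict to the nested value, then deduplicate.
unq_map = df["someprop"].map(lambda d: d["someotherprop"]).unique()

# Option 2: flatten the dicts into their own DataFrame; each key becomes a column.
flat = pd.json_normalize(df["someprop"].tolist())
unq_flat = flat["someotherprop"].unique()
```

Both variants return an array of the distinct values rather than a set. If some rows may lack the key, d.get("someotherprop") in the lambda would avoid a KeyError at the cost of introducing None values.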