Here’s the problematic code:
import pandas as pd
Here I create a sample src dictionary containing 3 dataframes:
src = {}
for i in range(1, 4):
    src[i] = pd.DataFrame({'a': [i, 2*i, 3*i], 'b': [10*i, 20*i, 30*i], 'c': [100*i, 200*i, 300*i]})
    display(src[i])
Here are the 3 dataframes created in the src dictionary:
   a   b    c
0  1  10  100
1  2  20  200
2  3  30  300

   a   b    c
0  2  20  200
1  4  40  400
2  6  60  600

   a   b    c
0  3  30  300
1  6  60  600
2  9  90  900
Here I want to append the a column from each dataframe in the src dictionary to an a dataframe in the output dictionary, and the b column to a b dataframe:
output = {}
for i in src:
    output['a'] = output['a'].concat([output[i]['a'], src[i][a]], axis = 1)
    output['b'] = output['b'].concat([output[i]['b'], src[i][b]], axis = 1)
I got this error message:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Input In [30], in <cell line: 11>()
10 # retrieve a column from all source dataframes, put them in a new dataframe. and these new dataframes are in a new dictionary.
11 for i in src:
---> 12 output['a'] = output['a'].concat([output[i]['a'], src[i][a]], axis = 1)
13 output['b'] = output['b'].concat([output[i]['b'], src[i][b]], axis = 1)
KeyError: 'a'
How can I fix it?
>Solution :
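The problem is that on the first iteration of the loop, output is still an empty dictionary, so it has no key called 'a', and reading output['a'] on the right-hand side raises the KeyError. Define the keys when you create output instead of starting from an empty dictionary. Note that fixing this alone will not make the loop run: concat is a pandas function (pd.concat), not a DataFrame method; src[i][a] needs the column name quoted, as in src[i]['a']; and output[i] does not exist, because output is keyed by column name rather than by the keys of src.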
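One possible way to put this together is sketched below. It is not the only fix, just a minimal version that assumes the goal is one dataframe per column name, with one column per entry of src. Instead of growing output inside a loop, it builds each entry with a single pd.concat call; labelling the resulting columns with the src keys (1, 2, 3) is a choice made here for illustration, not something the question asks for.

```python
import pandas as pd

# Rebuild the sample data from the question.
src = {}
for i in range(1, 4):
    src[i] = pd.DataFrame({'a': [i, 2*i, 3*i], 'b': [10*i, 20*i, 30*i], 'c': [100*i, 200*i, 300*i]})

# Create both keys of output up front, so no key is ever read before it exists.
# Passing a dict of Series to pd.concat with axis=1 places them side by side
# and uses the dict keys (here the src keys 1, 2, 3) as the column labels.
output = {
    col: pd.concat({i: df[col] for i, df in src.items()}, axis=1)
    for col in ['a', 'b']
}

print(output['a'])
print(output['b'])
```

With the sample data, output['a'] ends up with columns 1, 2, 3 and rows [1, 2, 3], [2, 4, 6], [3, 6, 9]. If you would rather keep the original loop, initialise output with both keys first (for example with the column taken from the first dataframe) and call pd.concat([...], axis=1) rather than a method on the dataframe.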