Is there a pure NumPy way to get to this expected outcome?
Right now I have to use pandas, and I would like to skip it.
import pandas as pd
import numpy as np
listOfDicts = [{'key1': np.array(10), 'key2': np.array(10), 'key3': np.array(44)},
               {'key1': np.array(2), 'key2': np.array(15), 'key3': np.array(22)},
               {'key1': np.array(25), 'key2': np.array(25), 'key3': np.array(11)},
               {'key1': np.array(35), 'key2': np.array(55), 'key3': np.array(22)}]
Use pandas to parse:
# pandas can unpack simply
df = pd.DataFrame(listOfDicts)
# get all values under the same key
xd = df.to_dict('list')
# ultimate goal
np.stack([v for k, v in xd.items() if k not in ['key1']], axis=1)
array([[10, 44],
[15, 22],
[25, 11],
[55, 22]])
# With pure NumPy, I would like to temporarily transform listOfDicts into this,
# from which I could do basically anything with it:
{'key1': [np.array([10, 2, 25, 35])],
'key2': [np.array([10, 15, 25, 55])],
'key3': [np.array([44, 22, 11, 22])]
}
>Solution :
One way to turn your DataFrame into a dictionary of NumPy arrays is to transpose it and use DataFrame.agg()
to merge the columns:
import numpy as np
import pandas as pd
listOfDicts = [{'key1': np.array(10), 'key2': np.array(10), 'key3': np.array(44)},
               {'key1': np.array(2), 'key2': np.array(15), 'key3': np.array(22)},
               {'key1': np.array(25), 'key2': np.array(25), 'key3': np.array(11)},
               {'key1': np.array(35), 'key2': np.array(55), 'key3': np.array(22)}]
# build the DataFrame first (same as in the question)
df = pd.DataFrame(listOfDicts)
df.transpose().agg(np.stack, axis=1).to_dict()
# {'key1': array([10, 2, 25, 35]),
# 'key2': array([10, 15, 25, 55]),
# 'key3': array([44, 22, 11, 22])}
If you just want the values, you can pull them out, stack them, and then slice the result with NumPy:
np.stack(df.transpose().agg(np.stack, axis=1).values)
# array([[10, 2, 25, 35],
# [10, 15, 25, 55],
# [44, 22, 11, 22]])
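If the aim is to skip pandas entirely, as the question asks, a dict comprehension plus np.stack may be enough. This is only a minimal sketch, not part of the answer above: it assumes every dict in listOfDicts has the same keys as the first one, and the names by_key and goal are made up for illustration.

```python
import numpy as np

# Same input as in the question: each value is a 0-d NumPy array.
listOfDicts = [{'key1': np.array(10), 'key2': np.array(10), 'key3': np.array(44)},
               {'key1': np.array(2), 'key2': np.array(15), 'key3': np.array(22)},
               {'key1': np.array(25), 'key2': np.array(25), 'key3': np.array(11)},
               {'key1': np.array(35), 'key2': np.array(55), 'key3': np.array(22)}]

# Assumption: every dict has the same keys as the first one.
keys = list(listOfDicts[0])

# Gather the values stored under each key into one 1-D array per key.
by_key = {k: np.stack([d[k] for d in listOfDicts]) for k in keys}

# The "ultimate goal" from the question: all keys except key1, one column per key.
goal = np.stack([v for k, v in by_key.items() if k != 'key1'], axis=1)

print(by_key)
print(goal)
```

Here by_key plays the role of the intermediate dictionary from the question (without the extra list around each array), and goal has shape (4, 2), matching the array produced by the pandas version in the question.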