Convert pandas DataFrame to 3D NumPy matrix

I have a pandas.DataFrame, shown below. You can build it with pd.DataFrame(data_dict):

data_dict = {
 'Elevation': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1},
 'Azimuth': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 0, 6: 1, 7: 2, 8: 3, 9: 4},
 'median': {0: 255,
  1: 255,
  2: 255,
  3: 255,
  4: 255,
  5: 256,
  6: 256,
  7: 256,
  8: 256,
  9: 256},
 'count': {0: 255,
  1: 255,
  2: 255,
  3: 255,
  4: 255,
  5: 250,
  6: 250,
  7: 250,
  8: 250,
  9: 250},
 'to_drop': {0: 1,
  1: 1,
  2: 1,
  3: 1,
  4: 1,
  5: 0,
  6: 0,
  7: 0,
  8: 0,
  9: 0}}

I want to convert it to a 3D NumPy array. Its shape should be [Azimuth.nunique(), Elevation.nunique(), 3], where the last axis holds (median, count, to_drop), i.e. [5, 2, 3].

I have tried data.groupby(['Elevation','Azimuth']).apply(lambda x: x.values).reset_index().values, which results in a (10, 3) array. How can I get a (5, 2, 3) array?

Solution:

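First pivot with DataFrame.pivot and DataFrame.stack, then convert the resulting MultiIndex DataFrame to a 3D array. One option is DataFrame.to_xarray, which requires the xarray package (the future_stack argument needs pandas 2.1 or later). In the result, the variable dimension holds the Azimuth values and level_1 holds the value columns: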

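import pandas as pd
df = pd.DataFrame(data_dict)
# Pivot Azimuth into the columns, then move the value names into the index.
out = (df.pivot(index='Elevation', columns='Azimuth').stack(level=0, future_stack=True)
         .to_xarray().to_array())
print(out)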
<xarray.DataArray (variable: 5, Elevation: 2, level_1: 3)>
array([[[255, 255,   1],
        [250, 256,   0]],

       [[255, 255,   1],
        [250, 256,   0]],

       [[255, 255,   1],
        [250, 256,   0]],

       [[255, 255,   1],
        [250, 256,   0]],

       [[255, 255,   1],
        [250, 256,   0]]], dtype=int64)
Coordinates:
  * Elevation  (Elevation) int64 0 1
  * level_1    (level_1) object 'count' 'median' 'to_drop'
  * variable   (variable) int32 0 1 2 3 4

print(out.shape)
(5, 2, 3)
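
Note that the output above lists level_1 as count, median, to_drop, while the question asks for (median, count, to_drop). The order may also depend on the pandas version and on future_stack, so it is worth checking before relying on it. The following is a minimal NumPy-only sketch of reordering the last axis; the names block, current, wanted and order are illustrative, and arr is a stand-in built from the printed values rather than the real result:

```python
import numpy as np

# Stand-in for the array printed above: 5 azimuths, 2 elevations,
# last axis ordered like the level_1 coordinate (count, median, to_drop).
block = np.array([[255, 255, 1],
                  [250, 256, 0]])
arr = np.tile(block, (5, 1, 1))

current = ["count", "median", "to_drop"]  # order shown in the output above
wanted = ["median", "count", "to_drop"]   # order asked for in the question

# Positions of the wanted names in the current layout, used to
# reindex the last axis with integer-array indexing.
order = [current.index(name) for name in wanted]
reordered = arr[..., order]

print(reordered.shape)  # (5, 2, 3)
print(reordered[0])
```

On the DataArray itself, label-based selection such as out.sel(level_1=wanted) should give the same reordering, and out.values returns the plain NumPy array.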

Another idea is to use numpy.reshape and numpy.transpose on the stacked frame:

df1 = df.pivot(index='Elevation', columns='Azimuth').stack(level=0, future_stack=True)

# Reshape to (Elevation, value, Azimuth), then move Azimuth to the front.
out = df1.to_numpy().reshape(*df1.index.levshape,-1).transpose(2, 0, 1)
print(out)
[[[255 255   1]
  [250 256   0]]

 [[255 255   1]
  [250 256   0]]

 [[255 255   1]
  [250 256   0]]

 [[255 255   1]
  [250 256   0]]

 [[255 255   1]
  [250 256   0]]]
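
A third possibility, not part of the answer above, is to skip the pivot and reshape the value columns directly. The sketch below assumes each (Azimuth, Elevation) pair occurs at most once. The helper name to_cube is hypothetical, and the frame is rebuilt in a compact form so the block runs on its own:

```python
import numpy as np
import pandas as pd

# Compact rebuild of the question's example frame.
df = pd.DataFrame({
    "Elevation": [0] * 5 + [1] * 5,
    "Azimuth": list(range(5)) * 2,
    "median": [255] * 5 + [256] * 5,
    "count": [255] * 5 + [250] * 5,
    "to_drop": [1] * 5 + [0] * 5,
})


def to_cube(frame, value_cols=("median", "count", "to_drop")):
    """Hypothetical helper: return an (Azimuth, Elevation, value) array."""
    az = np.sort(frame["Azimuth"].unique())
    el = np.sort(frame["Elevation"].unique())
    # Full Azimuth x Elevation grid, azimuth-major, so that a row-major
    # reshape yields axes (Azimuth, Elevation, value).
    grid = pd.MultiIndex.from_product([az, el], names=["Azimuth", "Elevation"])
    values = (frame.set_index(["Azimuth", "Elevation"])[list(value_cols)]
                   .reindex(grid))  # pairs missing from the frame become NaN rows
    return values.to_numpy().reshape(len(az), len(el), len(value_cols))


out = to_cube(df)
print(out.shape)  # (5, 2, 3)
print(out[0])
```

Here the last axis follows value_cols, so it comes out as (median, count, to_drop). If some pairs are missing, the reindex step fills them with NaN and the array becomes floating point.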