How to get numpy to interpret object array slice as single array?

When you ask NumPy to make an array out of a collection that includes arbitrary objects, it creates an array of dtype object. You can slice across those objects, but because NumPy knows nothing about their contents, you cannot index into an element as part of the same indexing operation, even if that element is itself a NumPy array.

However, if you slice the object array to select only the elements that are actually NumPy arrays, NumPy does not collapse that slice into a single array, even after another call to np.array(). Here is a small example of what I mean:

>>> import numpy as np
>>> aa = np.array([np.random.randn(3, 4), {'something': 'blah'}], dtype=object)
>>> aa.shape
(2,)
>>> np.array(aa[0:1])
array([array([[ 1.78237043, -0.61082005,  0.92160137,  0.58961677],
              [ 1.54183639, -0.43097464,  1.36213935, -1.2695875 ],
              [ 0.01431181, -0.62073519,  0.56267489, -0.46113538]])],
      dtype=object)
>>> np.array(aa[0:1]).shape # I want this to be (1, 3, 4)
(1,)

Is there any way to do this without a double copy (that is, without something like np.array(aa[0:1].tolist()))? Does an object array even allow this without such a copy?

Solution:

You can use np.stack to combine the arrays in a slice of the object array into a regular ndarray:

>>> aa = np.array([np.random.randn(3, 4), {'something': 'blah'}], dtype=object)
>>> aa
array([array([[-6.36267204e-01,  8.95707498e-02,  1.09275216e+00,
               -3.70594544e-01],
              [ 8.32865823e-01, -6.53876690e-01,  1.21000457e+00,
                1.22046398e+00],
              [-5.30262118e-01,  1.17934947e-04,  4.45156002e-01,
               -6.61549444e-02]])                                ,
       {'something': 'blah'}], dtype=object)
>>> np.stack(aa[0:1])
array([[[-6.36267204e-01,  8.95707498e-02,  1.09275216e+00,
         -3.70594544e-01],
        [ 8.32865823e-01, -6.53876690e-01,  1.21000457e+00,
          1.22046398e+00],
        [-5.30262118e-01,  1.17934947e-04,  4.45156002e-01,
         -6.61549444e-02]]])
>>> np.stack(aa[0:1]).shape
(1, 3, 4)

This also works when the slice contains several ndarrays, as long as they all have the same shape.
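
As a rough sketch of the multi-array case, the snippet below stacks two same-shaped arrays taken from an object array and then shows what happens when the shapes differ. The names objs and mixed are made up for this illustration, and the object arrays are filled element by element with np.empty only to sidestep NumPy's shape guessing when building them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical object array: two (3, 4) float arrays followed by a dict.
objs = np.empty(3, dtype=object)
objs[0] = rng.standard_normal((3, 4))
objs[1] = rng.standard_normal((3, 4))
objs[2] = {"something": "blah"}

# Slice out only the ndarray elements and stack them along a new first axis.
stacked = np.stack(objs[0:2])
print(stacked.shape, stacked.dtype)

# np.stack needs every element to have the same shape.
mixed = np.empty(2, dtype=object)
mixed[0] = np.zeros((3, 4))
mixed[1] = np.zeros((2, 4))
stack_failed = False
try:
    np.stack(mixed)
except ValueError as exc:
    stack_failed = True
    print("np.stack failed:", exc)
```

If the slice accidentally includes a non-array element such as the dict, the result is unlikely to be what you want, so it is worth restricting the slice to the array elements first.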

Internally, np.stack just treats the object array as a sequence and iterates over it. I'm not sure whether it has a significant performance benefit over your solution with np.array(aa[0:1].tolist()).
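
To compare the two approaches side by side, here is a small sketch (the variable names are invented for the example). It only checks that both routes produce the same values and that the result owns fresh memory; it says nothing about speed, which you would need to measure on your own data.

```python
import numpy as np

# Hypothetical object array holding one (3, 4) float array and a dict.
objs = np.empty(2, dtype=object)
objs[0] = np.arange(12.0).reshape(3, 4)
objs[1] = {"something": "blah"}

via_stack = np.stack(objs[0:1])
via_list = np.array(objs[0:1].tolist())

# Both routes should give a regular float array of shape (1, 3, 4).
print(via_stack.shape, via_list.shape)
print(np.array_equal(via_stack, via_list))

# For an object array, tolist() hands back the stored objects themselves,
# so the inner ndarray is not copied at that step.
same_object = objs[0:1].tolist()[0] is objs[0]
print(same_object)

# The stacked result is a new array and does not share memory with the original.
print(np.shares_memory(via_stack, objs[0]))
```

Either way, the element data is copied once into the new contiguous array, since an object array only stores references to separately allocated ndarrays.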
