NumPy ndarray of ndarray of float64 not flattening

I have a pandas dataframe with a column called ‘corr’. Each row contains an ndarray of float64. The following code is giving me issues:

import numpy as np
import pandas as pd

# Each row of the 'corr' column holds its own ndarray of float64
experimentDataFrame = pd.DataFrame({'corr': [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]})
corr = experimentDataFrame['corr'].to_numpy(copy=True)
# Inspect the outer array, one inner array, and one scalar before flattening
print([type(corr), corr.shape])
print([type(corr[0]), corr[0].shape])
print([type(corr[0][0]), corr[0][0].shape])
corr = corr.flatten()
# Same inspection after flattening
print([type(corr), corr.shape])
print([type(corr[0]), corr[0].shape])
print([type(corr[0][0]), corr[0][0].shape])

The output of which is

[<class 'numpy.ndarray'>, (3,)]
[<class 'numpy.ndarray'>, (2,)]
[<class 'numpy.float64'>, ()]
[<class 'numpy.ndarray'>, (3,)]
[<class 'numpy.ndarray'>, (2,)]
[<class 'numpy.float64'>, ()]

I’ve also tried corr.ravel() and corr.reshape(-1) instead of flatten, with no difference. I’ve also tried corr.reshape(6), but I get ValueError: cannot reshape array of size 35 into shape (6,).

What I’m expecting is that after flattening, corr[0] should be a float64 and not still an ndarray. My strong suspicion is that since corr is an ndarray of ndarrays of unknown length, flatten (and the rest) doesn’t work. Is there a function that will work without iterating manually?

Solution:

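The problem is that experimentDataFrame['corr'].to_numpy(copy=True) is already flat: its shape is (35,) according to your error message, or (3,) in the posted example. It is a dtype=object array whose elements are themselves ndarrays, so flatten, ravel, and reshape(-1) have nothing left to flatten.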
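
To see this for yourself, here is a minimal check that reuses the three-row example from the question (the name df is just a placeholder for illustration). It prints the dtype of the outer array and shows that flatten leaves the inner arrays untouched:

```python
import numpy as np
import pandas as pd

# Same toy data as in the question: one ndarray per row
df = pd.DataFrame({'corr': [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]})
corr = df['corr'].to_numpy(copy=True)

print(corr.dtype)   # object
print(corr.shape)   # (3,)

# flatten only collapses the dimensions of the outer array,
# which is already 1-D; the elements stay ndarrays
flat = corr.flatten()
print(flat.shape)       # (3,)
print(type(flat[0]))    # <class 'numpy.ndarray'>
```

NumPy treats each inner array as an opaque Python object here, which is why none of the reshaping methods look inside them.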

You just want something like:

corr = np.concatenate([arr.ravel() for arr in experimentDataFrame['corr']])

Possibly, you can just do:

corr = np.concatenate(experimentDataFrame['corr'].tolist())

The second form is enough if all the inner arrays in your column are already flat. It isn’t clear from your question that this is the case, but either of those should work.

EDIT:

And actually, you don’t need .tolist. This works too:

corr = np.concatenate(experimentDataFrame['corr'])
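
As a quick sanity check, here is a small sketch with inner arrays of different lengths, since the question mentions arrays of unknown length. The data and the name df are made up for illustration, and the sketch goes through .tolist() so that np.concatenate receives a plain Python list:

```python
import numpy as np
import pandas as pd

# Hypothetical column whose inner arrays have different lengths
df = pd.DataFrame({'corr': [np.array([1.0, 2.0]), np.array([3.0]), np.array([4.0, 5.0, 6.0])]})

# Join the inner arrays end to end into one 1-D float64 array
flat = np.concatenate(df['corr'].tolist())
print(flat.dtype, flat.shape)   # float64 (6,)
print(type(flat[0]))            # <class 'numpy.float64'>

# The ravel variant gives the same result when the inner arrays are 1-D
flat_ravel = np.concatenate([arr.ravel() for arr in df['corr']])
print(np.array_equal(flat, flat_ravel))   # True
```

After this, flat[0] is a float64 scalar rather than an ndarray, which is what the question was asking for.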
