Appending a series of 2D NumPy arrays to create a 3D NumPy array


I thought this was an easy problem, but I have been struggling with it.

I have a dataframe with 4 columns (Open, High, Low, Close).

I need to iteratively select, 100 times, a batch of 75 rows, each having 4 columns, so that the final shape is (100, 75, 4).

I have tried np.append, np.stack, np.dstack and np.concatenate. None of them works.

With np.append I get a shape of (7500, 4).

With np.stack, the second stacking step fails with an error that all input arrays must have the same shape (after the first stack, the accumulated array is 3D while each new batch is still 2D).

My latest attempt, using np.stack (other attempts omitted):

for i in range (100):
  print(i)
  if (i==0):
    temp_array=timeseries[['Open','High','Low','Close']].iloc[i:i+75].to_numpy()
  else:
    temp_temp_array=np.stack([temp_array,timeseries[['Open','High','Low','Close']].iloc[i:i+75].to_numpy()])
    temp_array=temp_temp_array
  print(temp_array.shape)

It seems Stack Overflow and the rest of the internet do not have an answer (or maybe I am not asking the right question).

>Solution :

As suggested by Quang, you can use strides here for speed and memory (!) efficiency:

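X = df.values
n_windows = X.shape[0] - 75 + 1  # a larger count would make the view read past the end of X
rolling_X = np.lib.stride_tricks.as_strided(X, shape=(n_windows, 75, X.shape[1]), strides=(X.strides[0], X.strides[0], X.strides[1]))
batches = rolling_X[:100]  # first 100 windows; shape (100, 75, 4) when X has 4 columns and at least 174 rows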
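
If raw strides feel too low-level, the same batches can be built in two other ways. The sketch below is not part of the original answer. It assumes NumPy 1.20 or newer for sliding_window_view, and it uses a random array X as a hypothetical stand-in for timeseries[['Open','High','Low','Close']].to_numpy(); the names n_batches, window, batches and windows are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Hypothetical stand-in for timeseries[['Open','High','Low','Close']].to_numpy()
rng = np.random.default_rng(0)
X = rng.random((200, 4))

n_batches, window = 100, 75

# Option 1: collect the 2D slices in a list and stack once.
# np.stack adds a new leading axis, so 100 arrays of (75, 4) become (100, 75, 4).
batches = np.stack([X[i:i + window] for i in range(n_batches)])

# Option 2: sliding_window_view returns a read-only view with the window
# axis placed last, i.e. (rows - window + 1, 4, window). Move the window
# axis to the middle and keep the first n_batches windows.
windows = sliding_window_view(X, window, axis=0).transpose(0, 2, 1)[:n_batches]

print(batches.shape, windows.shape)
```

The loop in the question fails because it stacks the accumulated result with each new batch; stacking the whole list once avoids that. Option 1 copies the data, while Option 2 is a view onto X, similar in spirit to the strided approach but with bounds checked by NumPy.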
