I thought this was an easy problem, but I have been struggling with it.

I have a dataframe with 4 columns (Open, High, Low, Close).

I need to iteratively select

- 100 times
- a batch of 75 rows
- each having 4 columns,

so that the final shape is (100, 75, 4).

I have tried `np.append`, `np.stack`, `np.dstack`, and `np.concatenate`. None of them works.

With `np.append`, I get a shape of (7500, 4).

With `np.stack`, the second stack raises an error that all input arrays must have the same shape (after the first stack, the accumulated array no longer has the same shape as a new batch).

My latest attempt, using `np.stack` (other attempts omitted):

```
for i in range(100):
    print(i)
    if i == 0:
        temp_array = timeseries[['Open', 'High', 'Low', 'Close']].iloc[i:i+75].to_numpy()
    else:
        temp_temp_array = np.stack([temp_array, timeseries[['Open', 'High', 'Low', 'Close']].iloc[i:i+75].to_numpy()])
        temp_array = temp_temp_array
    print(temp_array.shape)
```

It seems Stack Overflow (and the rest of the internet) does not have an answer, or maybe I am not asking the right question.

### Solution

As suggested by Quang, you can use strides here for speed and memory (!) efficiency:

```
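import numpy as np
X = df[['Open', 'High', 'Low', 'Close']].to_numpy()  # shape (n_rows, 4)
rolling_X = np.lib.stride_tricks.as_strided(X, shape=(X.shape[0] - 75 + 1, 75, X.shape[1]), strides=(X.strides[0], X.strides[0], X.strides[1]))  # only n_rows - 74 full windows fit; more would read past the buffer
batches = rolling_X[:100]  # first 100 windows, shape (100, 75, 4); needs n_rows >= 174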
```
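
If you would rather not compute strides by hand, here is a minimal sketch of two alternatives that should give the same (100, 75, 4) result. It assumes NumPy 1.20 or later (for `sliding_window_view`) and uses a made-up 200-row `timeseries` frame of random numbers as a stand-in for the real data; the names `window`, `n_batches`, `batches_loop`, and `batches_view` are only illustrative.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Hypothetical stand-in data: 200 rows of random values, not real prices.
rng = np.random.default_rng(0)
timeseries = pd.DataFrame(
    rng.random((200, 4)), columns=["Open", "High", "Low", "Close"]
)

window, n_batches = 75, 100
X = timeseries[["Open", "High", "Low", "Close"]].to_numpy()  # (200, 4)

# Option 1: collect the 2-D windows in a list and call np.stack once.
# Stacking inside the loop fails because the accumulated array becomes 3-D
# while each new window is still 2-D.
batches_loop = np.stack([X[i:i + window] for i in range(n_batches)])

# Option 2: sliding_window_view returns a read-only view without copying.
# With axis=0 the window axis is appended last, giving (n_windows, 4, 75),
# so the last two axes are swapped to get (n_windows, 75, 4).
windows = sliding_window_view(X, window, axis=0).swapaxes(1, 2)
batches_view = windows[:n_batches]

print(batches_loop.shape)
print(batches_view.shape)
print(np.array_equal(batches_loop, batches_view))
```

The list-then-stack version copies the data and is the closest fix to the loop in the question. The `sliding_window_view` version avoids the copy like `as_strided` does, but checks the bounds for you; call `.copy()` on the result if you need a writable array.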