Stack arrays on specified dimension with arbitrary dimension size


Consider the following data:

import numpy as np
data = np.array([[i for i in range(3)] for _ in range(9)])
print(data)
print(f'data has shape {data.shape}')

[[0 1 2]
 [0 1 2]
 [0 1 2]
 [0 1 2]
 [0 1 2]
 [0 1 2]
 [0 1 2]
 [0 1 2]
 [0 1 2]]
data has shape (9, 3)

There is also a parameter, let's call it history. It controls how many consecutive rows [0 1 2] are grouped together along a new first dimension. As an example, consider 1 iteration of that process with history=2:

history = 2
data = np.array([[[0, 1, 2], [0, 1, 2]]])
print(f'data has now shape {data.shape}')
data has now shape (1, 2, 3)

Now, let’s consider 2 iterations:

history = 2
data = np.array([[[0, 1, 2], [0, 1, 2]],[[0, 1, 2], [0, 1, 2]]])
print(f'data has now shape {data.shape}')
data has now shape (2, 2, 3)

This process should be repeated until the data is fully processed. That implies that we might lose some data at the end whenever data.shape[0] % history != 0 (here 9 % 2 == 1, so the last row is dropped).
The final result for history=2 would thus be

     ([[[0, 1, 2],
        [0, 1, 2]],

       [[0, 1, 2],
        [0, 1, 2]],

       [[0, 1, 2],
        [0, 1, 2]],

       [[0, 1, 2],
        [0, 1, 2]]])
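
To make the arithmetic concrete, here is a small sketch of how many complete groups fit into the data and how many trailing rows are left over. The names `n_groups` and `n_dropped` are illustrative only and do not appear in the original post.

```python
import numpy as np

data = np.array([[i for i in range(3)] for _ in range(9)])
history = 2

# divmod returns the number of complete groups and the leftover rows in one step
n_groups, n_dropped = divmod(data.shape[0], history)
print(n_groups, n_dropped)  # 4 complete groups, 1 row dropped
```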

How can this be done efficiently?

>Solution :

If I understand correctly, you can slice, then reshape:

history = 2

# keep only the rows that fill complete groups of size history, then reshape
out = data[:data.shape[0]//history*history].reshape((-1, history, data.shape[1]))

Output:

array([[[0, 1, 2],
        [0, 1, 2]],

       [[0, 1, 2],
        [0, 1, 2]],

       [[0, 1, 2],
        [0, 1, 2]],

       [[0, 1, 2],
        [0, 1, 2]]])
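
For reuse, the slice-and-reshape step could be wrapped in a small helper. This is only a minimal sketch: the function name `stack_history` is hypothetical (it is not part of NumPy), and it assumes `data` is a 2-D NumPy array and `history` is a positive integer.

```python
import numpy as np

def stack_history(data, history):
    """Group consecutive rows of a 2-D array into blocks of `history` rows.

    Trailing rows that do not fill a complete block are dropped.
    Hypothetical helper for illustration, not a NumPy function.
    """
    if history <= 0:
        raise ValueError("history must be a positive integer")
    n_rows, n_cols = data.shape
    # largest multiple of history that does not exceed n_rows
    usable = (n_rows // history) * history
    return data[:usable].reshape(-1, history, n_cols)

data = np.array([[i for i in range(3)] for _ in range(9)])
out = stack_history(data, 2)
print(out.shape)  # (4, 2, 3)
```

For a C-contiguous input such as the array above, the leading slice stays contiguous, so the reshape should return a view of the original data rather than a copy.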
