I have a numpy array of shape (1684, 129, 522). Basically 1684 frames of dimensions 129X522 (only 1 channel so I have not specified it in the array.)
I was writing a function that would take 4 of these frames (each of 129 X 522) at a time and create a new input numpy array of size (4,129,522).
Hence, the net result would be a numpy array of shape (1684 X 4 X 129 X 522) from an original array shape of (1684 X 129 X 522)
    import numpy as np
    from collections import deque

    def create_frame_windows(episode, frame_window_length=4):
        episode_length, dim1, dim2 = episode.shape
        new_episode = np.zeros((episode_length, frame_window_length, dim1, dim2))
        # Start with a window made entirely of blank frames
        data_q_deque = deque(maxlen=frame_window_length)
        for _ in range(frame_window_length):
            data_q_deque.append(np.zeros((dim1, dim2)))
        data_q = np.array(data_q_deque)
        print('Initial data queue', data_q.shape)
        for frame_no in range(len(episode)):
            frame = episode[frame_no]
            # Shift the window left by one frame, then append the newest frame
            data_q[:-1] = data_q[1:]
            data_q[-1] = frame
            new_episode[frame_no] = data_q
        print('New episode length', new_episode.shape)
        return new_episode
Run the function:
    episode = np.load(os.path.join(paths.INPUT_DATA_PATH, epi_file))
    print('Episode shape', episode.shape)
    print('Initial size', sys.getsizeof(episode))
    final_episode = create_frame_windows(episode, 4)
    print('Final episode shape', final_episode.shape)
    print('Final size', sys.getsizeof(final_episode))
    Episode shape (1684, 129, 522)
    Initial size 113397336
    Final episode shape (1684, 4, 129, 522)
    Final size 3628710304
My issue is that while the shape of the final episode is as expected, its size is 32X the size of the original episode array (3628710304 / 113397336 = 31.99), even though the number of elements only increased 4X.
Have I written the function wrong, or is there a more logical explanation for why this is happening, i.e. a 32X increase in the size reported by sys.getsizeof for a 4X increase in the number of elements?
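It’s likely that the original array consists of 8-bit integers: its reported size of 113397336 bytes is almost exactly 1684 × 129 × 522 = 113397192 elements at one byte each.
np.zeros creates float64 values by default, which take 8 bytes per element, so 4X the elements at 8X the bytes per element gives the 32X you are seeing.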
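To make the arithmetic concrete, here is a minimal sketch on a tiny stand-in array. The `uint8` dtype and the small shape are assumptions for illustration (check `episode.dtype` on the real data), and the names `frames`, `default_zeros` and `matched_zeros` are hypothetical.

```python
import numpy as np

# Tiny stand-in for the episode: 8 frames of 3 x 5.
# uint8 is an assumption; inspect episode.dtype on the real array.
frames = np.zeros((8, 3, 5), dtype=np.uint8)

# Same layout as new_episode in the question: 4 frames per window.
default_zeros = np.zeros((8, 4, 3, 5))                      # float64 by default
matched_zeros = np.zeros((8, 4, 3, 5), dtype=frames.dtype)  # keeps the source dtype

print(frames.dtype, frames.itemsize)                # bytes per element in the source
print(default_zeros.dtype, default_zeros.itemsize)  # bytes per element with the default dtype
print(default_zeros.nbytes // frames.nbytes)        # 4x elements * 8x bytes each
print(matched_zeros.nbytes // frames.nbytes)        # 4x elements, same bytes each
```

If the dtype really is one byte wide, allocating both the output array and the zero frames with the episode's own dtype should bring the growth back down to roughly 4X.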
You can pass a datatype to