I'm referring to this complete guide on how to use autoencoders in Python. Notice the author adds:
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
after loading the MNIST data.
Why do they divide the image data by 255, and why 255 specifically? And after that, why do they reshape each 2D matrix into a 1D vector?
Thank you so much!
>Solution :
- Why divide by 255:
MNIST pixels are stored as 8-bit unsigned integers, so each value lies between 0 and 255 (255 is the maximum an unsigned byte can hold). Dividing by 255 normalizes the intensities to the range [0, 1], which generally helps gradient-based training converge.
- Why reshape to 1D:
The autoencoder in that guide is built from fully connected (Dense) layers, and a Dense layer expects each sample to be a flat vector. Flattening each 28x28 image into a 784-element vector matches that input shape. If you wanted to keep the 2D structure of the images, you would instead use layers designed for it, such as the convolutional layers of a CNN, which do accept 2D image inputs.
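To make the two steps concrete, here is a minimal sketch using a random NumPy array as a stand-in for the MNIST training set (the array name and shape mirror the snippet in the question; the data itself is synthetic):

```python
import numpy as np

# Hypothetical stand-in for MNIST: 10 grayscale 28x28 images with 8-bit pixel values
x_train = np.random.randint(0, 256, size=(10, 28, 28), dtype=np.uint8)

# Pixel intensities are 8-bit integers in [0, 255]; dividing by 255 scales them to [0.0, 1.0]
x_train = x_train.astype('float32') / 255.
print(x_train.min() >= 0.0, x_train.max() <= 1.0)  # True True

# Flatten each 28x28 image into a 784-element vector so each sample fits a Dense input layer
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
print(x_train.shape)  # (10, 784)
```

Only the leading (sample) axis is preserved; `np.prod(x_train.shape[1:])` multiplies the remaining dimensions together (28 * 28 = 784), so the same line works for any per-sample shape.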