Colorization Autoencoders using Keras
This article walks through a practical use case of autoencoders: the colorization of grayscale images. We will use Keras to code the autoencoder.
An autoencoder has two main parts:
Encoder: This transforms the input into a low-dimensional latent vector. Because it reduces the dimension, it is forced to learn the most important features of the input.
Decoder: This tries to reconstruct the input as closely as possible from the latent vector.
When designing an autoencoder, it is essential to choose the latent dimension carefully. If it is larger than the input dimension, the autoencoder tends to memorize the input instead of learning useful features. We will implement the encoder using CNNs and will use Conv2DTranspose for the decoder section of the autoencoder.
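To make the dimension argument concrete, here is a small arithmetic sketch. It assumes the 32 * 32 image size and the 256-unit latent vector that this article uses; the variable names are only illustrative.

```python
# Sizes of the vectors involved, assuming 32 x 32 images and a
# 256-unit latent vector (the values used in this article).
rows, cols = 32, 32

input_dim = rows * cols * 1    # one grayscale channel
output_dim = rows * cols * 3   # three colour channels (R, G, B)
latent_dim = 256

# How much smaller the latent vector is than the grayscale input.
compression = input_dim / latent_dim

print(f"input: {input_dim}, latent: {latent_dim}, output: {output_dim}")
print(f"the latent vector is {compression:.0f}x smaller than the input")
```

Since the latent vector is smaller than the input, the encoder cannot simply copy every pixel through and has to keep only the most useful features.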
The dataset contains 50k colour images of shape 32 * 32 * 3 for training and 10k colour images of the same shape for testing.
Code: Import all the libraries
As the dataset contains only colour images, we need to convert them to grayscale for our task. We define a function for that.
Code: Function to convert RGB images to Grayscale
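A minimal sketch of such a function is given below. The weights are the commonly used ITU-R BT.601 luma coefficients; that choice is an assumption, as is the function name rgb2gray.

```python
import numpy as np


def rgb2gray(rgb):
    """Convert RGB images of shape (..., 3) to grayscale of shape (...).

    Uses the ITU-R BT.601 luma weights (an assumed, commonly used choice).
    """
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])


# Usage: a batch of two 4 x 4 pure-red images.
red = np.zeros((2, 4, 4, 3))
red[..., 0] = 1.0
gray = rgb2gray(red)
print(gray.shape)  # the channel axis is gone
```

Note that the result has no channel axis, so it has to be added back before the images are fed to a convolutional layer.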
Code: Load the dataset
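The article does not name the dataset. Its description matches both CIFAR-10 and CIFAR-100 as shipped with Keras, and the classes mentioned in the results look like CIFAR-100 categories, so the sketch below assumes CIFAR-100. The helper name load_colour_and_gray and its loader parameter are hypothetical; the parameter only exists so that another loader can be swapped in.

```python
import numpy as np


def rgb2gray(rgb):
    """Grayscale conversion with the (assumed) ITU-R BT.601 weights."""
    return np.dot(rgb[..., :3], [0.299, 0.587, 0.114])


def load_colour_and_gray(loader=None):
    """Return (x_train, x_train_gray, x_test, x_test_gray).

    By default the data comes from keras.datasets.cifar100, which is an
    assumption; the first call downloads the dataset. The class labels
    are discarded because colourization does not use them.
    """
    if loader is None:
        from tensorflow.keras.datasets import cifar100
        loader = cifar100.load_data
    (x_train, _), (x_test, _) = loader()
    return x_train, rgb2gray(x_train), x_test, rgb2gray(x_test)


# Usage (downloads the dataset on first run):
# x_train, x_train_gray, x_test, x_test_gray = load_colour_and_gray()
```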
For the model to learn efficiently, it is better to convert the images to floating point. We also need to normalize the pixel values so that they lie between 0 and 1. This keeps the gradients from growing out of control during back-propagation.
Code: Normalize the data
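A minimal sketch of this step, assuming the pixel values are stored as integers in the range 0 to 255. The random array stands in for the real images, and the function name normalize is illustrative.

```python
import numpy as np


def normalize(images):
    """Convert pixel values from the range [0, 255] to float32 in [0, 1]."""
    return images.astype("float32") / 255.0


# Usage on a stand-in batch of random 8-bit colour images.
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
x_norm = normalize(x)
print(x_norm.dtype, x_norm.min(), x_norm.max())
```

The same function can be applied to both the colour images and their grayscale versions.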
The performance of deep learning models depends heavily on the hyper-parameters (including the number of layers, the number of filters in each layer, the batch size, etc.), so choosing them well is an essential skill. The best results usually come from experimenting with several different sets. The settings used in this article, such as a 256-unit latent vector and 30 training epochs, are described in the sections that follow.
For the task of colourizing, the input is a grayscale image. A grayscale image has only 1 channel, whereas a colour image has 3: red, green and blue. We use Input from the Keras library to take an input of shape (rows, cols, 1).
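A grayscale array usually has shape (N, rows, cols) after conversion, so the channel axis has to be added explicitly. The sketch below shows one way to do it; the zero-filled array is only a stand-in for the real grayscale images.

```python
import numpy as np

rows, cols = 32, 32

# Stand-in for the grayscale images, which have no channel axis yet.
x_gray = np.zeros((10, rows, cols), dtype="float32")

# Add a trailing channel axis: (N, rows, cols) -> (N, rows, cols, 1).
x_gray = x_gray.reshape(x_gray.shape[0], rows, cols, 1)

# Per-image shape, as it would be passed to Input(shape=input_shape).
input_shape = x_gray.shape[1:]
print(input_shape)
```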
The Encoder is a stack of 3 Convolutional Layers with an increasing number of filters, followed by a Dense layer with 256 units for generating latent vectors.
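The encoder described above could be sketched as follows. The three convolutional layers and the 256-unit Dense layer come from the text; the filter counts (64, 128, 256), the kernel size of 3, the stride of 2 and the ReLU activation are assumptions chosen for illustration.

```python
import numpy as np
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Input
from tensorflow.keras.models import Model


def build_encoder(input_shape=(32, 32, 1), layer_filters=(64, 128, 256),
                  kernel_size=3, latent_dim=256):
    """Stack of strided Conv2D layers followed by a Dense latent layer.

    The default filter counts, kernel size and strides are assumed values.
    """
    inputs = Input(shape=input_shape, name="encoder_input")
    x = inputs
    for filters in layer_filters:
        # Each stride-2 convolution halves the height and width.
        x = Conv2D(filters=filters, kernel_size=kernel_size, strides=2,
                   activation="relu", padding="same")(x)
    x = Flatten()(x)
    latent = Dense(latent_dim, name="latent_vector")(x)
    return Model(inputs, latent, name="encoder")


encoder = build_encoder()

# Usage: two blank grayscale images are mapped to two latent vectors.
latent = encoder.predict(np.zeros((2, 32, 32, 1), dtype="float32"), verbose=0)
print(latent.shape)
```

With these strides, the feature map shrinks from 32 * 32 to 4 * 4 before it is flattened.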
The decoder section of the autoencoder tries to decompress the latent vector in order to reconstruct the image. In our case, the input to the decoder is a layer of shape (None, 256). It is followed by a stack of three deconvolutional (Conv2DTranspose) layers with a decreasing number of filters in each layer. We make sure that the last layer has shape (None, 32, 32, 3). The number of channels must be 3 so that the reconstruction can be compared with the ground-truth colour images when the loss is computed for back-propagation.
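A possible decoder along these lines is sketched below. The Dense and Reshape step that turns the latent vector back into a 4 * 4 * 256 feature map, the filter counts (256, 128, 64) and the final sigmoid layer that projects to 3 channels are all assumptions; the text only fixes the input shape, the three transposed convolutions and the output shape.

```python
import numpy as np
from tensorflow.keras.layers import Conv2DTranspose, Dense, Input, Reshape
from tensorflow.keras.models import Model


def build_decoder(latent_dim=256, feature_shape=(4, 4, 256),
                  layer_filters=(256, 128, 64), kernel_size=3, channels=3):
    """Expand a latent vector into a (32, 32, 3) image.

    feature_shape and layer_filters are assumed values.
    """
    latent_inputs = Input(shape=(latent_dim,), name="decoder_input")
    height, width, depth = feature_shape
    x = Dense(height * width * depth)(latent_inputs)
    x = Reshape(feature_shape)(x)
    for filters in layer_filters:
        # Each stride-2 transposed convolution doubles height and width.
        x = Conv2DTranspose(filters=filters, kernel_size=kernel_size,
                            strides=2, activation="relu", padding="same")(x)
    # Project to 3 colour channels; sigmoid keeps values in [0, 1].
    outputs = Conv2DTranspose(filters=channels, kernel_size=kernel_size,
                              activation="sigmoid", padding="same",
                              name="decoder_output")(x)
    return Model(latent_inputs, outputs, name="decoder")


decoder = build_decoder()

# Usage: two latent vectors are decoded into two colour images.
images = decoder.predict(np.zeros((2, 256), dtype="float32"), verbose=0)
print(images.shape)
```

The sigmoid output matches the normalized colour images, whose values also lie between 0 and 1.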
The encoder and decoder do not have to be mirror images of each other.
Finally, we define the model, named autoencoder, which takes an input, passes it through the encoder, and then passes the result through the decoder.
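The composition itself is a single line. To keep the sketch below self-contained and short, it uses deliberately tiny stand-in encoder and decoder models rather than the full ones described above; only the last Model(...) call is the point of the example.

```python
import numpy as np
from tensorflow.keras.layers import (Conv2D, Conv2DTranspose, Dense, Flatten,
                                     Input, Reshape)
from tensorflow.keras.models import Model

# Tiny stand-in encoder: (32, 32, 1) image -> 256-dimensional vector.
enc_in = Input(shape=(32, 32, 1))
x = Conv2D(8, 3, strides=2, padding="same", activation="relu")(enc_in)
x = Flatten()(x)
encoder = Model(enc_in, Dense(256)(x), name="encoder")

# Tiny stand-in decoder: 256-dimensional vector -> (32, 32, 3) image.
dec_in = Input(shape=(256,))
y = Dense(16 * 16 * 8)(dec_in)
y = Reshape((16, 16, 8))(y)
y = Conv2DTranspose(3, 3, strides=2, padding="same", activation="sigmoid")(y)
decoder = Model(dec_in, y, name="decoder")

# The autoencoder chains the two: input -> encoder -> decoder.
inputs = Input(shape=(32, 32, 1), name="gray_input")
autoencoder = Model(inputs, decoder(encoder(inputs)), name="autoencoder")

# Usage: grayscale images in, colour images out.
colour = autoencoder.predict(np.zeros((2, 32, 32, 1), dtype="float32"),
                             verbose=0)
print(colour.shape)
```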
We now train the autoencoder by slicing the data into batches of size batch_size, for 30 epochs. The important point to note is in the arguments of the fit function: the input to the model is the set of grayscale images, and the corresponding colour images serve as the labels. The same holds for the validation set.
Generally, for a classification task, we feed the images to the model as inputs and give their respective classes as labels. During training, we measure the performance of the model by how well it classifies the images into those classes. For this task, however, we provide the colour images as the labels, because we want the model to output RGB images when we give it grayscale images.
We have also used callbacks to reduce the learning rate if the validation loss is not decreasing much.
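The training step could look roughly like the sketch below. The 30 epochs and the idea of reducing the learning rate on a stalled validation loss come from the text; the mean squared error loss, the Adam optimizer, the batch size of 32 and the ReduceLROnPlateau settings are assumptions. The function name train_colorizer is hypothetical, and the small model and random data at the end exist only to show the call.

```python
import numpy as np
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.layers import Conv2D, Input
from tensorflow.keras.models import Model


def train_colorizer(model, x_train_gray, x_train, x_test_gray, x_test,
                    epochs=30, batch_size=32):
    """Fit the model with grayscale inputs and colour images as labels."""
    # Lower the learning rate when val_loss stops improving (assumed values).
    lr_reducer = ReduceLROnPlateau(monitor="val_loss", factor=0.3,
                                   patience=5, min_lr=1e-6, verbose=1)
    model.compile(loss="mse", optimizer="adam")  # assumed loss and optimizer
    return model.fit(x_train_gray, x_train,
                     validation_data=(x_test_gray, x_test),
                     epochs=epochs, batch_size=batch_size,
                     callbacks=[lr_reducer], verbose=0)


# Usage with a stand-in model and random data, for one short epoch.
inp = Input(shape=(32, 32, 1))
out = Conv2D(3, 3, padding="same", activation="sigmoid")(inp)
toy_model = Model(inp, out)

rng = np.random.default_rng(0)
gray_train = rng.random((8, 32, 32, 1)).astype("float32")
colour_train = rng.random((8, 32, 32, 3)).astype("float32")
gray_test = rng.random((4, 32, 32, 1)).astype("float32")
colour_test = rng.random((4, 32, 32, 3)).astype("float32")

history = train_colorizer(toy_model, gray_train, colour_train,
                          gray_test, colour_test, epochs=1, batch_size=4)
print(sorted(history.history.keys()))
```

For the real run, the full autoencoder and the normalized dataset would replace the stand-ins, with epochs left at 30.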
Code: Results and analysis
The autoencoder performs acceptably on the colourization task. It correctly predicted that the sky is blue, that chimps have varying shades of brown, that leaves are green, and so on. It also makes some wrong predictions: the sunflower has some shades of grey in it, the orange has no colour predicted, and the mushroom is dark rather than reddish.