Skip to main content

Tensorflow Representation Learning

Guangzhou, China

Github

Auto encoders are a type of Artificial Neural Networks (ANN) that are used to perform data encoding (Representation Learning). Auto encoder use the same data for input data and output data:

  • Feed Forward ANN (Supervised Learning):
    • Cat Image => ANN <= True Label (Cat)
  • Auto Encoder ANN (Unsupervised Learning):
    • Cat Image => Encoder ANN/Decoder ANN => Reconstructed Cat Image => <= Cat Image

The encoding side of this network generates a representation of the input image (dimensionality reduction). The decoding side then takes the encoded image and tries to reconstruct the input image from it. The performance metric for an Auto Encoder is the similarity between the input and output image.

Denoising Autoencoder

A common use-case for autoencoders is the removal of noise from data, e.g. compression artifacts in low resolution images. This is done by feeding the autoencoder the compressed image and comparing the reconstructed output image to the original, non-compressed image. This training will give us a model that allows us to remove compression artifacts from similarly "mis-treated" images.

Inspecting the Dataset

Start by loading the mnist digits dataset - which is provided by Keras:

# download mnist dataset
from tensorflow.keras.datasets import mnist
(X_train, y_train),(X_test, y_test) = mnist.load_data()

We can inspect a random set of 225 images and their labels from the set using matplotlib:

# inspect a larger set of images
## create a 15x15 grid
W_grid = 15
L_grid = 15
fig, axes = plt.subplots(L_grid, W_grid)
fig, axes = plt.subplots(L_grid, W_grid, figsize = (17,17))
## flaten the 15 x 15 matrix into 225 array
axes = axes.ravel()
## get the length of the training dataset
n_training = len(X_train)

for i in np.arange(0, W_grid * L_grid):
# select a random number
index = np.random.randint(0, n_training)
# read and display an image with the selected index
axes[i].imshow( X_train[index] )
axes[i].set_title(y_train[index], fontsize = 8)
axes[i].axis('off')

plt.subplots_adjust(hspace=0.4)
plt.tight_layout()
plt.show()

Tensorflow Transfer Learning

Adding Noise

The dataset is clean of noise and can be used as our reference material. To create the "dirty" set of images we can use Numpy to add some random variations:

# create a 28x28 matrix of random number
# added_noise = np.random.randn(*(28,28))
noise_factor = 0.3
added_noise = noise_factor * np.random.randn(*(28,28))

plt.imshow(added_noise)
plt.show()

Tensorflow Transfer Learning

We can add noise to an existing image the same way:

# adding noise to our image
## how much noise do we need
noise_factor = 0.2
## select an image
sample_image = X_train[666]
## and add noise map to it
noisy_sample_image = sample_image + noise_factor * np.random.randn(*(28,28))

# previs noisy image
plt.imshow(noisy_sample_image, cmap="gray")
plt.show()

Tensorflow Transfer Learning

Now we can run a loop over both our training and testing dataset, add the same amount of random noise to each image and store those "dirty" datasets in variables:

# add noise to all images in training dataset
X_train_noisy = []
noise_factor = 0.2

for sample_image in X_train:
sample_image_noisy = sample_image + noise_factor * np.random.randn(*(28,28))
sample_image_noisy = np.clip(sample_image_noisy, 0., 1.)
X_train_noisy.append(sample_image_noisy)

# convert list to np array
X_train_noisy = np.array(X_train_noisy)


# add noise to all images in testing dataset
X_test_noisy = []
noise_factor = 0.4

for sample_image in X_test:
sample_image_noisy = sample_image + noise_factor * np.random.randn(*(28,28))
sample_image_noisy = np.clip(sample_image_noisy, 0., 1.)
X_test_noisy.append(sample_image_noisy)

# convert list to np array
X_test_noisy = np.array(X_test_noisy)

Build the Autoencoder

The autoencoder is based on a regular Keras sequential model and consists of the encoder - that is basically the same convolution layer used in regular CNNs with supervised learning - and an decoder - that is an "encoder in reverse". The decoder takes the representation of the image generated by the encoder from the representation layer and has to reconstruct the initial image from this compress data:

# build autoencoder model
autoencoder = tf.keras.models.Sequential()

# build the encoder CNN
autoencoder.add(tf.keras.layers.Conv2D(16, (3,3), strides=1, padding="same", input_shape=(28, 28, 1)))
autoencoder.add(tf.keras.layers.MaxPooling2D((2,2), padding="same"))

autoencoder.add(tf.keras.layers.Conv2D(8, (3,3), strides=1, padding="same"))
autoencoder.add(tf.keras.layers.MaxPooling2D((2,2), padding="same"))

# representation layer
autoencoder.add(tf.keras.layers.Conv2D(8, (3,3), strides=1, padding="same"))

# build the decoder CNN
autoencoder.add(tf.keras.layers.UpSampling2D((2, 2)))
autoencoder.add(tf.keras.layers.Conv2DTranspose(8,(3,3), strides=1, padding="same"))
autoencoder.add(tf.keras.layers.UpSampling2D((2, 2)))
autoencoder.add(tf.keras.layers.Conv2DTranspose(1, (3,3), strides=1, activation='sigmoid', padding="same"))


autoencoder.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.Adam(lr=0.001))
autoencoder.summary()

Train the Autoencoder

For the training we now feed the generated noisy dataset into the encoder and have the autencoder compress and reconstruct each image. After those steps we provide to original - noise-free - image to compare the reconstructed image to. The training is complete when we reach a minimum of differences between both of them:

autoencoder.fit(X_train_noisy.reshape(-1, 28, 28, 1),          
X_train.reshape(-1, 28, 28, 1),
epochs=10,
batch_size=200)

Evaluate the Model

To get an idea how well our model is performing we can take the first 15 images from the noisy test dataset and compare these source images with the predicted de-noised image:

# test training
# take 15 images from noisy test set and predict de-noised state
denoised_images = autoencoder.predict(X_test_noisy[:15].reshape(-1, 28, 28, 1))
# plot noisy input vs denoised output
fig, axes = plt.subplots(nrows=2, ncols=15, figsize=(30,6))
for images, row in zip([X_test_noisy[:15], denoised_images], axes):
for img, ax in zip(images, row):
ax.imshow(img.reshape((28, 28)), cmap='gray')

plt.show()

Tensorflow Transfer Learning