
Autoencoders

Ishani Kathuria edited this page Dec 25, 2022 · 2 revisions

Overview

Autoencoders (AE) are a specific type of feedforward neural network in which the target output is the input itself.

Autoencoders compress the input into a lower-dimensional code and then reconstruct the output from this representation. The code is a compact “summary” or “compression” of the input, also called the latent-space representation.

An autoencoder consists of 3 components: an encoder, a code, and a decoder. The encoder compresses the input and produces the code; the decoder then reconstructs the input using only this code.
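The encoder-code-decoder pipeline can be sketched with plain numpy. This is a minimal illustration with untrained random weights (a real autoencoder would learn them); the sizes are hypothetical, e.g. a flattened 28x28 image compressed to a 32-dimensional code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 784-dim input, 32-dim code.
input_dim, code_dim = 784, 32

# Encoder and decoder as single linear layers with random (untrained) weights.
W_enc = rng.standard_normal((input_dim, code_dim)) * 0.01
W_dec = rng.standard_normal((code_dim, input_dim)) * 0.01

x = rng.standard_normal(input_dim)   # one input sample
code = np.tanh(x @ W_enc)            # encoder: input -> code (latent-space representation)
x_hat = code @ W_dec                 # decoder: code -> reconstruction

print(code.shape)    # (32,)
print(x_hat.shape)   # (784,)
```

Note that the reconstruction `x_hat` has the same dimensionality as the input, while the code is much smaller; training adjusts the weights so that `x_hat` approximates `x`.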

Autoencoders are mainly dimensionality reduction (or compression) algorithms with a couple of important properties:

  • Data-specific: Autoencoders are only able to meaningfully compress data similar to what they have been trained on. Since they learn features specific to the given training data, they are different from a standard data compression algorithm like gzip. So we can’t expect an autoencoder trained on handwritten digits to compress landscape photos.

  • Lossy: The output of the autoencoder will not be exactly the same as the input; it will be a close but degraded reconstruction. If you want lossless compression, autoencoders are not the way to go.

  • Unsupervised: To train an autoencoder we don’t need to do anything fancy; we just feed it the raw input data. Autoencoders are considered an unsupervised learning technique since they don’t need explicit labels to train on. To be more precise, they are self-supervised, because they generate their own labels (the inputs themselves) from the training data.
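The self-supervised training setup can be shown end-to-end with a tiny linear autoencoder trained by gradient descent on MSE. This is a sketch, not a production training loop; the data, sizes, and learning rate are all illustrative. The key point is that the training target is the input itself.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((16, 8))               # raw, unlabeled training data

# One linear encoder/decoder pair; the "labels" are just the inputs.
W_enc = rng.standard_normal((8, 3)) * 0.1
W_dec = rng.standard_normal((3, 8)) * 0.1

lr = 0.1
losses = []
for _ in range(200):
    code = X @ W_enc
    X_hat = code @ W_dec
    err = X_hat - X                   # target == input: self-supervised
    losses.append(np.mean(err ** 2))
    # gradient descent on the (scaled) MSE loss
    grad_dec = code.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

print(losses[0], losses[-1])          # reconstruction error decreases
```

No labels were supplied anywhere; the loss compares the reconstruction against the raw input.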

Table of Contents

  1. Architecture
  2. Types of AE
    1. Variational autoencoder
    2. Denoising autoencoder
    3. Sparse autoencoder

Architecture

  • Both the encoder and decoder are fully-connected feedforward neural networks, essentially standard ANNs. The code is a single layer of the network with a dimensionality of our choice.
  • First, the input passes through the encoder, which is a fully-connected ANN, to produce the code.
  • The decoder, which has a similar ANN structure, then produces the output only using the code.
  • The goal is to get an output identical to the input.
  • Note that the decoder architecture is the mirror image of the encoder. This is not a requirement, but it’s typically the case. The only requirement is that the dimensionality of the input and output be the same.
  • There are 4 hyperparameters that we need to set before training an autoencoder:
    • Code size: number of nodes in the middle layer. Smaller size results in more compression.
    • Number of layers: the autoencoder can be as deep as we like. For example, we might use 2 layers in both the encoder and decoder, not counting the input and output.
    • Number of nodes per layer: the autoencoder architecture we’re working on is called a stacked autoencoder since the layers are stacked one after another. Usually stacked autoencoders look like a “sandwich”. The number of nodes per layer decreases with each subsequent layer of the encoder and increases back in the decoder. Also, the decoder is symmetric to the encoder in terms of the layer structure. As noted above this is not necessary and we have total control over these parameters.
    • Loss function: we use either mean squared error (MSE) or binary cross-entropy. If the input values are in the range [0, 1], we typically use cross-entropy; otherwise, we use mean squared error.
  • Autoencoders are trained the same way as ANNs via backpropagation.
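The four hyperparameters above can be made concrete with a numpy sketch of a stacked autoencoder forward pass. The weights are random and untrained, and the layer sizes are illustrative; the point is to show where code size, depth, nodes per layer, and the two candidate losses fit in.

```python
import numpy as np

rng = np.random.default_rng(0)

# The four hyperparameters (illustrative values):
code_size = 32                        # nodes in the middle layer
layer_sizes = [784, 128, code_size]   # encoder: 784 -> 128 -> 32 (2 layers)
# The decoder mirrors the encoder: 32 -> 128 -> 784.

def layer(n_in, n_out):
    return rng.standard_normal((n_in, n_out)) * 0.01

encoder = [layer(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
rev = layer_sizes[::-1]
decoder = [layer(a, b) for a, b in zip(rev, rev[1:])]

def forward(x, weights):
    for W in weights:
        x = 1.0 / (1.0 + np.exp(-(x @ W)))   # sigmoid activation
    return x

x = rng.random(784)                   # input in [0, 1]
code = forward(x, encoder)
x_hat = forward(code, decoder)

# Both candidate loss functions; with inputs in [0, 1], cross-entropy applies.
mse = np.mean((x - x_hat) ** 2)
bce = -np.mean(x * np.log(x_hat + 1e-9) + (1 - x) * np.log(1 - x_hat + 1e-9))
```

Shrinking `code_size` increases compression; adding entries to `layer_sizes` deepens the "sandwich" described above.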

Types of AE

Variational autoencoder

Variational autoencoders (VAEs) are a type of generative model that extends the basic autoencoder architecture by making the latent space probabilistic: instead of a single deterministic code, the encoder outputs the parameters of a distribution over latent codes.

The encoder maps the input data to a latent space, and the decoder maps the latent space back to the original data space.

The key difference between VAEs and regular autoencoders is that the latent space in VAEs is continuous and is typically assumed to follow a specific distribution, such as a normal distribution. This allows VAEs to generate new data by sampling from the latent space and running it through the decoder.
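The sampling step described above is usually implemented with the reparameterization trick. Here is a minimal numpy sketch; the encoder outputs (`mu`, `log_var`) are hypothetical placeholder values rather than the output of a real trained network.

```python
import numpy as np

rng = np.random.default_rng(0)
code_dim = 2

# Hypothetical encoder outputs for one input: a mean and a log-variance
# per latent dimension, instead of a single deterministic code.
mu = np.array([0.5, -1.0])
log_var = np.array([0.1, 0.3])

# Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
eps = rng.standard_normal(code_dim)
z = mu + np.exp(0.5 * log_var) * eps   # sampled latent code, fed to the decoder

# To generate *new* data, sample z directly from the prior N(0, I)
# and run it through the decoder.
z_new = rng.standard_normal(code_dim)
```

Because the latent space is continuous and matched to a normal prior during training, decoding `z_new` yields a plausible new sample rather than garbage.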

Denoising autoencoder

Denoising autoencoders are a type of autoencoder trained to reconstruct the original, uncorrupted input from a corrupted version of it.

The goal of a denoising autoencoder is to learn a representation of the data that is robust to noise, and it is often used for tasks such as image denoising or anomaly detection.
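The training setup only differs from a plain autoencoder in how the input/target pairs are built, which can be sketched as follows. Gaussian noise is used here as one illustrative corruption; masking noise (zeroing random entries) is also common.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random((4, 784))   # a small batch of clean inputs in [0, 1]

# Corrupt the inputs, e.g. with Gaussian noise.
noise_std = 0.2
x_noisy = np.clip(x + noise_std * rng.standard_normal(x.shape), 0.0, 1.0)

# Training pairs: the *corrupted* input is fed to the network,
# but the reconstruction loss is computed against the *clean* original.
inputs, targets = x_noisy, x
```

Since the network never sees the clean input, it cannot simply copy it and is forced to learn noise-robust features.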

Sparse autoencoder

Sparse autoencoders are a type of autoencoder that is trained to enforce sparsity in the hidden layer activations.

The goal of a sparse autoencoder is to learn a compact representation of the data that uses only a small number of hidden units.

This can be useful for tasks such as feature learning or dimensionality reduction, as the learned representation is typically more interpretable and efficient than a dense representation.
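Sparsity is typically enforced by adding a penalty term to the reconstruction loss. A common choice is a KL-divergence penalty that pushes each hidden unit's average activation toward a small target value; an L1 penalty on the activations is a simpler alternative. The activations and target value below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hidden-layer activations for a batch (values in (0, 1),
# e.g. from a sigmoid layer): 100 samples, 64 hidden units.
h = rng.random((100, 64))

rho = 0.05                   # target sparsity: desired average activation
rho_hat = h.mean(axis=0)     # actual average activation of each hidden unit

# KL-divergence sparsity penalty, added (with a weight) to the loss.
kl = np.sum(rho * np.log(rho / rho_hat)
            + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

# Simpler alternative: an L1 penalty on the activations.
l1 = np.abs(h).sum()
```

The KL term is zero only when every unit's average activation equals `rho`, so minimizing it keeps most hidden units inactive for any given input.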
