Glow: Generative Flows with Invertible 1x1 Convolutions
Annotated Paper Link: Google Drive
These are my notes on the paper by Kingma & Dhariwal (2018).
- This paper introduces a new generative model called Glow, which uses invertible 1x1 convolutions.
- Section 1 is a quick review of Generative Models, the families of generative models, and their weaknesses compared to normalizing flows.
- Section 2 reviews Flow-based generative models
- Check Tomczak (2024) for a comprehensive overview of flow-based models
- In section 3, the new flow, Glow, is introduced
- Consists of a series of flow steps, combined in a multi-scale architecture
- Each step of a flow has 3 stages
- Actnorm
- Invertible 1x1 Convolutions
- Affine Coupling Layers
- Actnorm
- Applies a learned per-channel affine transformation (a scale and a bias per channel) to the input
- \(y_{i,j} = s \odot x_{i,j} + b\)
- The parameters are initialized so that activations have zero mean and unit variance on an initial minibatch (data-dependent initialization), serving as a replacement for batch normalization
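A minimal numpy sketch of the actnorm transform and its inverse; the function names and array shapes are my own, not from the paper:

```python
import numpy as np

def actnorm_forward(x, s, b):
    """Per-channel affine transform y = s * x + b.

    x: activations of shape (H, W, C); s, b: learned scale and bias of shape (C,).
    The Jacobian is diagonal, so its log-determinant is H * W * sum(log|s|).
    """
    y = s * x + b
    logdet = x.shape[0] * x.shape[1] * np.sum(np.log(np.abs(s)))
    return y, logdet

def actnorm_inverse(y, s, b):
    # Exact inverse: subtract the bias, divide by the scale.
    return (y - b) / s
```

Because the transform is element-wise, inverting it and computing the log-determinant are both trivial.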
- Invertible 1x1 Convolutions
- These mix information across channels; a learned generalization of the fixed channel permutation used in RealNVP
- The convolution is invertible whenever the \(c \times c\) weight matrix \(\mathbf{W}\) is non-singular, so the input can be recovered from the output; its log-determinant is \(h \cdot w \cdot \log |\det \mathbf{W}|\), which the paper makes cheap by parameterizing \(\mathbf{W}\) via an LU decomposition
- \(y_{i,j} = \mathbf{W} x_{i, j}\)
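A 1x1 convolution applies the same \(c \times c\) matrix at every spatial position, so it can be sketched as a per-pixel matrix multiply. The names and shapes below are mine, not the paper's (which works with the direct or LU-parameterized weight):

```python
import numpy as np

def invertible_1x1_conv(x, W):
    """Apply the same c x c matrix W at every spatial position.

    x: (H, W_sp, C); W: (C, C) invertible weight matrix.
    Log-determinant of the full Jacobian is H * W_sp * log|det W|.
    """
    h, w_sp, c = x.shape
    y = x @ W.T  # y[i, j] = W @ x[i, j] for every pixel (i, j)
    _, logabsdet = np.linalg.slogdet(W)
    return y, h * w_sp * logabsdet

def invertible_1x1_conv_inverse(y, W):
    # Invert by convolving with W^{-1}.
    return y @ np.linalg.inv(W).T
```

The determinant of the small \(c \times c\) matrix is shared by all pixels, which is why the log-determinant is just scaled by \(h \cdot w\).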
- Affine Coupling Layers
- Same as in RealNVP: split the channels in half, and let one half predict an element-wise scale and translation for the other half; this makes the layer trivially invertible
- All of them have easily computable log determinants
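A numpy sketch of the RealNVP-style affine coupling layer; `nn` stands in for the (unconstrained) neural network that predicts the log-scale and translation, and all names here are my own:

```python
import numpy as np

def affine_coupling_forward(x, nn):
    """Split channels; transform the second half conditioned on the first.

    nn(xa) returns (log_s, t), each with the shape of the second half.
    Since xa passes through unchanged, the Jacobian is triangular and
    its log-determinant is just sum(log_s).
    """
    xa, xb = np.split(x, 2, axis=-1)
    log_s, t = nn(xa)
    yb = np.exp(log_s) * xb + t
    return np.concatenate([xa, yb], axis=-1), np.sum(log_s)

def affine_coupling_inverse(y, nn):
    # xa == ya, so nn(ya) reproduces the same log_s and t used forward.
    ya, yb = np.split(y, 2, axis=-1)
    log_s, t = nn(ya)
    xb = (yb - t) * np.exp(-log_s)
    return np.concatenate([ya, xb], axis=-1)
```

Note that `nn` never needs to be inverted or differentiated for the log-determinant, which is what makes coupling layers so convenient.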
- They compare Glow with RealNVP on CIFAR-10, ImageNet, and LSUN. Glow outperforms RealNVP in terms of bits per dimension (BPD).
- For qualitative results, they train the model on CelebA-HQ. It generates high-quality images and learns a continuous latent space as demonstrated by interpolation.