Deep Image Prior
These are notes on Ulyanov et al. (2018)
- Main Point: The structure of ConvNets is sufficient to capture a great deal of low-level statistics before any training
- The study focuses on the prior captured by a deep convolutional network, independent of any training
- The Driving Reasoning
- ConvNets are SOTA for many image-related tasks (super-resolution, image reconstruction, denoising, etc.)
- They are usually trained on huge datasets.
- It can be assumed that large training datasets are the reason for the strong performance, but learning alone isn’t a sufficient explanation
- Generalization requires the structure of the network to resonate with the structure of the data
- Their Method
- Basically, the authors fit a randomly initialized ConvNet directly to the single corrupted image and use the fitted network’s output as the restoration.
- The Task
- They consider inverse tasks such as denoising, super-resolution, and inpainting.
- Expressed as an energy-minimization problem: \(x^* = \arg\min_x E(x; x_0) + R(x)\)
- \(E(x; x_0)\): task-dependent data term (e.g., how similar is the reconstructed image to the corrupted one?)
- \(R(x)\): regularization term (e.g., the probability that \(x\) occurs in nature, as determined by the prior of a pretrained model)
- \(x_0\) is the noisy/low-resolution/occluded image
- \(x^*\) is the model’s predicted clean/high-resolution/inpainted image
- Deep networks are applied by mapping a random code \(z\) to an image \(x\): \(x = f_\theta (z)\)
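The energy formulation above can be sketched in a few lines. This is a minimal illustration, not the paper's code: the data term is taken to be a squared L2 distance (a common choice for denoising), and the regularizer defaults to zero, matching the setup the notes describe next.

```python
import numpy as np

def data_term(x, x0):
    # E(x; x0): task-dependent fidelity term; for denoising, a squared
    # L2 distance between the reconstruction x and the corrupted image x0
    return float(np.sum((x - x0) ** 2))

def energy(x, x0, regularizer=lambda x: 0.0):
    # Full objective E(x; x0) + R(x); R defaults to zero,
    # as in the deep-image-prior setup
    return data_term(x, x0) + regularizer(x)

# A perfect reconstruction has zero energy under the default (zero) prior
x0 = np.array([[0.2, 0.8], [0.5, 0.1]])
print(energy(x0.copy(), x0))
```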
- Their method
- Instead of finding the parameters by training on a large dataset, they optimize the parameters to map a fixed random code \(z\) to the given corrupted image \(x_0\)
- \(\theta^* = \arg\min_{\theta} E(f_\theta(z); x_0)\)
- and they set the regularizer to zero; the network structure itself acts as the prior. Thus,
- \(x^* = f_{\theta ^ *}(z)\)
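The optimization \(\theta^* = \arg\min_\theta E(f_\theta(z); x_0)\) can be sketched as plain gradient descent. In this toy, a linear map \(W\) stands in for the ConvNet and a flat vector for the image, so it illustrates only the optimization scheme, not the convolutional prior; the shapes and learning rate are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# x0: the single corrupted observation (a flat vector stands in for an image)
n, m = 32, 16
x0 = rng.standard_normal(n)

# z: a fixed random code; theta = W, a linear map standing in for the ConvNet
z = rng.standard_normal(m)
W = np.zeros((n, m))

def f(W, z):
    # f_theta(z): the "generator" mapping the code to an image
    return W @ z

lr = 0.005
for _ in range(300):
    r = f(W, z) - x0                   # residual of E = ||f_theta(z) - x0||^2
    W -= lr * (2.0 * np.outer(r, z))   # gradient step on theta

x_star = f(W, z)                       # x* = f_{theta*}(z)
```

Run to convergence, the toy simply reproduces \(x_0\); the interesting behavior (signal before noise) depends on the ConvNet architecture, which this linear stand-in does not capture.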
- Why does it work?
- One might expect the model to simply fit the noise in \(x_0\)
- This doesn’t happen right away, because the ConvNet architecture has high resistance to learning noise and low resistance to learning the signal
- => the model fits the signal before it fits the noise
- => training is stopped early, before the model starts fitting the noise
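The signal-before-noise effect can be reproduced in miniature with a hand-picked stand-in for the architecture's bias (my illustration, not the paper's): parameterize the output through a circular Gaussian smoothing operator \(K\) and run gradient descent on \(\|Kc - x_0\|^2\). Smooth (low-frequency) components of \(x_0\) are fitted quickly, high-frequency noise only very slowly, so an early-stopped reconstruction lands closer to the clean signal than \(x_0\) itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Circular Gaussian smoothing operator K: a toy stand-in for the
# ConvNet's bias toward smooth outputs
n, sigma = 64, 2.0
idx = np.arange(n)
d = np.abs(idx[None, :] - idx[:, None])
d = np.minimum(d, n - d)                    # circular distance
K = np.exp(-0.5 * (d / sigma) ** 2)
K /= K.sum(axis=1, keepdims=True)

clean = np.sin(2 * np.pi * idx / n)         # smooth underlying signal
x0 = clean + 0.3 * rng.standard_normal(n)   # corrupted observation

c = np.zeros(n)
lr = 0.4
for _ in range(50):                         # stop early
    r = K @ c - x0
    c -= lr * (2.0 * K.T @ r)               # gradient of ||K c - x0||^2

x_early = K @ c                             # early-stopped reconstruction

def mse(a, b):
    return float(np.mean((a - b) ** 2))
```

After 50 steps the low-frequency signal is fitted while most of the noise is not, so `mse(x_early, clean)` comes out below `mse(x0, clean)`; run much longer, the fit drifts toward reproducing the noise as well.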
- Applications
- They apply their model to multiple tasks including denoising, super-resolution, inpainting, etc.
- In all tasks, the model outperforms or closely matches the SOTA non-learning methods and comes close to those trained on large datasets
- To summarize, ConvNets are really good image priors even before any training