References
Behrmann, J., Grathwohl, W., Chen, R. T. Q., Duvenaud, D., &
Jacobsen, J.-H. (2019). Invertible residual networks. In K. Chaudhuri
& R. Salakhutdinov (Eds.), Proceedings of the 36th international
conference on machine learning (Vol. 97, pp. 573–582). PMLR. https://proceedings.mlr.press/v97/behrmann19a.html
Bengio, Y., & Bengio, S. (1999). Modeling high-dimensional discrete
data with multi-layer neural networks. Advances in Neural Information
Processing Systems, 12. https://proceedings.neurips.cc/paper_files/paper/1999/hash/e6384711491713d29bc63fc5eeb5ba4f-Abstract.html
Kingma, D. P., & Dhariwal, P. (2018). Glow: Generative flow with
invertible 1x1 convolutions. In S. Bengio, H. Wallach, H. Larochelle, K.
Grauman, N. Cesa-Bianchi, & R. Garnett (Eds.), Advances in
neural information processing systems (Vol. 31). Curran Associates,
Inc. https://proceedings.neurips.cc/paper_files/paper/2018/file/d139db6a236200b21cc7f752979132d0-Paper.pdf
Kingma, D. P., Salimans, T., Jozefowicz, R., Chen, X., Sutskever, I.,
& Welling, M. (2016). Improved variational inference with inverse
autoregressive flow. Advances in Neural Information Processing
Systems, 29. https://proceedings.neurips.cc/paper_files/paper/2016/hash/ddeebdeefdb7e7e7a697e1c3e3d8ef54-Abstract.html
Leskovec, J. (2023, December 7). Graph neural networks [Video].
Stanford University. https://www.youtube.com/watch?v=ZfK4FDk9uy8
Li, T., Tian, Y., Li, H., Deng, M., & He, K. (2024).
Autoregressive image generation without vector quantization. https://arxiv.org/abs/2406.11838
Oord, A. van den, Kalchbrenner, N., & Kavukcuoglu, K. (2016).
Pixel recurrent neural networks. https://arxiv.org/abs/1601.06759
Oord, A. van den, Vinyals, O., & Kavukcuoglu, K. (2018). Neural
discrete representation learning. https://arxiv.org/abs/1711.00937
Rezende, D., & Mohamed, S. (2015). Variational inference with
normalizing flows. In F. Bach & D. Blei (Eds.), Proceedings of
the 32nd international conference on machine learning (Vol. 37, pp.
1530–1538). PMLR. https://proceedings.mlr.press/v37/rezende15.html
Tian, K., Jiang, Y., Yuan, Z., Peng, B., & Wang, L. (2024).
Visual autoregressive modeling: Scalable image generation via
next-scale prediction. https://arxiv.org/abs/2404.02905
Tomczak, J. M. (2024). Deep generative modeling. Springer
International Publishing. https://doi.org/10.1007/978-3-031-64087-2
Ulyanov, D., Vedaldi, A., & Lempitsky, V. (2018). Deep image prior.
Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR).
Yu, L., Cheng, Y., Sohn, K., Lezama, J., Zhang, H., Chang, H.,
Hauptmann, A. G., Yang, M.-H., Hao, Y., Essa, I., & Jiang, L.
(2023). MAGVIT: Masked generative video transformer. https://arxiv.org/abs/2212.05199
Yu, L., Lezama, J., Gundavarapu, N. B., Versari, L., Sohn, K., Minnen,
D., Cheng, Y., Birodkar, V., Gupta, A., Gu, X., Hauptmann, A. G., Gong,
B., Yang, M.-H., Essa, I., Ross, D. A., & Jiang, L. (2024).
Language model beats diffusion – tokenizer is key to visual
generation. https://arxiv.org/abs/2310.05737
Yu, Q., Weber, M., Deng, X., Shen, X., Cremers, D., & Chen, L.-C.
(2024). An image is worth 32 tokens for reconstruction and
generation. https://arxiv.org/abs/2406.07550
Zhou, C., Yu, L., Babu, A., Tirumala, K., Yasunaga, M., Shamis, L.,
Kahn, J., Ma, X., Zettlemoyer, L., & Levy, O. (2024).
Transfusion: Predict the next token and diffuse images with one
multi-modal model. https://arxiv.org/abs/2408.11039
Zoran, D., & Weiss, Y. (2011). From learning models of natural image
patches to whole image restoration. 2011 International Conference on
Computer Vision, 479–486. https://doi.org/10.1109/ICCV.2011.6126278
Zoran, D., & Weiss, Y. (2012). Natural images, Gaussian mixtures and
dead leaves. In F. Pereira, C. J. Burges, L. Bottou, & K. Q.
Weinberger (Eds.), Advances in neural information processing
systems (Vol. 25). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2012/file/e97ee2054defb209c35fe4dc94599061-Paper.pdf