[UDL Study Notes] Ch 16 - Normalizing Flows

한창훈
October 29, 2025

Overview

This series of posts is a set of study notes documenting my progress through the book "Understanding Deep Learning" [1].
This post covers Chapter 16, Normalizing Flows.

1. Likelihood and Invertibility

Chapter 16 covers normalizing flows, which transform probability distributions through invertible mappings.
Unlike VAEs, where the likelihood is only approximated, flows allow the exact likelihood to be computed, which I found particularly impressive.
Specifically, a complex data distribution Pr(\boldsymbol{x} | \boldsymbol{\phi}) is modeled by passing samples from a simple base distribution Pr(\boldsymbol{z}) through a transformation \boldsymbol{x} = f[\boldsymbol{z}, \boldsymbol{\phi}].
The cost is that every layer must be invertible and its Jacobian determinant must be cheap to compute, which is a nontrivial constraint.
In return, this structural restriction gives flow models a sense of mathematical completeness and stability that is often lacking in other generative models.
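
To make the exact-likelihood claim concrete, here is a minimal sketch of the change-of-variables computation, Pr(x | phi) = Pr(z) / |det(df/dz)| with z = f^{-1}[x, phi]. It assumes the simplest possible flow, an elementwise affine map with a standard normal base distribution. The function names and the parameter values are my own illustrative choices, not taken from the book, and the code uses plain Python to stay dependency-free.

```python
import math

def standard_normal_logpdf(z):
    """Log-density of the base distribution Pr(z) = N(0, I) for a list of floats z."""
    return sum(-0.5 * zi * zi - 0.5 * math.log(2.0 * math.pi) for zi in z)

def flow_forward(z, scale, shift):
    """x = f[z, phi]: elementwise affine map with phi = (scale, shift), every scale != 0."""
    return [s * zi + b for zi, s, b in zip(z, scale, shift)]

def flow_inverse(x, scale, shift):
    """z = f^{-1}[x, phi]."""
    return [(xi - b) / s for xi, s, b in zip(x, scale, shift)]

def flow_log_likelihood(x, scale, shift):
    """Exact log Pr(x | phi) from the change-of-variables formula."""
    z = flow_inverse(x, scale, shift)
    # The Jacobian of an elementwise map is diagonal, so log|det| is a sum of log|scale|.
    log_abs_det = sum(math.log(abs(s)) for s in scale)
    return standard_normal_logpdf(z) - log_abs_det

# Hypothetical parameters, chosen only for illustration.
scale = [2.0, 0.5, -1.5]
shift = [1.0, -3.0, 0.25]

z = [0.3, -1.2, 0.7]                 # a latent point
x = flow_forward(z, scale, shift)    # the corresponding data point
log_px = flow_log_likelihood(x, scale, shift)
print(x, log_px)
```

A real flow stacks many such invertible layers and sums their log-determinants. The affine map is used here only because its inverse and Jacobian are trivial to write down.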

2. Residual Flows and Contraction Mapping

In the section on residual flows, the use of the Banach fixed point theorem to guarantee that each layer can be inverted was particularly intriguing.
While most neural networks rely on empirical stability, here the condition is stated mathematically: the residual branch f must have a Lipschitz constant below 1.
In the equation
y = z + f[z],
if f is a contraction mapping, then for any y the iteration z ← y − f[z] always converges to a single fixed point, which is exactly the z that produced y.
This provides strong intuition for why residual flows can remain invertible while keeping high representational power: a balance between expressivity and stability.
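
The fixed-point argument can be sketched in a few lines. This is a toy example under my own assumptions: the residual branch is a hand-picked elementwise function, 0.5 * tanh(z), whose Lipschitz constant is 0.5. In a real residual flow, f would be a neural network whose Lipschitz constant is constrained during training. All function names are hypothetical.

```python
import math

def residual_branch(z):
    """Hypothetical f[z]: elementwise 0.5 * tanh(z), Lipschitz constant 0.5 < 1."""
    return [0.5 * math.tanh(zi) for zi in z]

def residual_forward(z):
    """y = z + f[z]."""
    return [zi + fi for zi, fi in zip(z, residual_branch(z))]

def residual_inverse(y, num_iters=100, tol=1e-12):
    """Recover z from y by iterating z <- y - f[z].

    Because f is a contraction, the map z -> y - f[z] is one too, so the
    Banach fixed point theorem guarantees convergence to a unique z.
    """
    z = list(y)  # any starting point works; y is a convenient one
    for _ in range(num_iters):
        z_next = [yi - fi for yi, fi in zip(y, residual_branch(z))]
        if max(abs(a - b) for a, b in zip(z_next, z)) < tol:
            return z_next
        z = z_next
    return z

z_true = [0.8, -2.0, 3.5]
y = residual_forward(z_true)
z_rec = residual_inverse(y)
print(max(abs(a - b) for a, b in zip(z_true, z_rec)))  # reconstruction error
```

The error shrinks by at least the Lipschitz constant at every step, so a smaller constant means faster inversion but a less flexible layer.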

3. GLOW and Image Synthesis

The section on GLOW demonstrated the practical power of Flow-based models.
While GLOW resembles GANs in that it can generate realistic images, its strength lies in being a probabilistically interpretable generative model.
It processes 256×256×3 image tensors using Coupling Layers and 1×1 Convolutions, progressively reducing resolution through a multi-scale architecture.
In the latent space, interpolation between two encoded faces results in smooth, natural transitions, which is possible because invertibility lets any real image be mapped to its latent code.
However, GLOW-generated samples are slightly lower in perceptual quality compared to GANs.
This trade-off seems reasonable given the structural constraints imposed by invertibility, which prioritize mathematical precision over visual sharpness.
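
As a rough illustration of the coupling layers mentioned above, here is a minimal sketch of an affine coupling layer acting on a flat list of values. The conditioner `scale_and_shift` is a made-up stand-in for the small convolutional network a real model would use. The 1×1 convolutions and the multi-scale architecture are omitted entirely.

```python
import math

def scale_and_shift(h1):
    """Hypothetical conditioner: maps the untouched half h1 to (log-scale, shift).

    It never has to be inverted, so it can be an arbitrary function of h1.
    """
    total = sum(h1)
    log_s = [math.tanh(total + i) for i in range(len(h1))]
    t = [math.sin(total * (i + 1)) for i in range(len(h1))]
    return log_s, t

def coupling_forward(h):
    """Split h into halves; pass h1 through and transform h2 conditioned on h1."""
    half = len(h) // 2  # assumes an even number of elements
    h1, h2 = h[:half], h[half:]
    log_s, t = scale_and_shift(h1)
    out2 = [x * math.exp(ls) + ti for x, ls, ti in zip(h2, log_s, t)]
    # The Jacobian is triangular, so log|det| is just the sum of the log-scales.
    log_abs_det = sum(log_s)
    return h1 + out2, log_abs_det

def coupling_inverse(h_out):
    """Undo coupling_forward exactly."""
    half = len(h_out) // 2
    h1, out2 = h_out[:half], h_out[half:]
    log_s, t = scale_and_shift(h1)  # h1 was passed through unchanged, so this is recomputable
    h2 = [(y - ti) * math.exp(-ls) for y, ls, ti in zip(out2, log_s, t)]
    return h1 + h2

h = [0.4, -1.1, 2.0, 0.3]
out, log_det = coupling_forward(h)
rec = coupling_inverse(out)
print(out, log_det, rec)
```

Since only half of the values change per layer, practical models alternate or mix which half is transformed between layers. In GLOW that mixing role is played by the 1×1 convolutions.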

4. The Fundamental Significance of Flow

Ultimately, normalizing flows represent an attempt to model data in a mathematically exact manner.
The same model supports both exact likelihood evaluation and sample generation, aiming to overcome the instability of GANs and the approximation limits of VAEs.
Whereas other generative models rely on heuristics to produce “good” samples, normalizing flows generate data through explicit probability computation, pursuing a direction closer to mathematically defining what “good” means than to approximating it intuitively.

Reference

[1] Prince, S. J. D. (2023). Understanding Deep Learning. The MIT Press. Retrieved from http://udlbook.com