Question 1

How does a diffusion model work?

Accepted Answer

Diffusion models work by first defining a forward process that gradually adds Gaussian noise to data over many timesteps until the result is nearly indistinguishable from pure noise. A neural network is then trained to reverse this process, typically by predicting the noise that was added at a given timestep. During generation, the model starts from random noise and iteratively denoises it to produce a new sample.
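The forward process described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the linear noise schedule, its endpoints, the 1000-step count, and the helper names `make_alpha_bars` and `add_noise` are all assumptions made for the example. It relies on the standard closed form that lets you jump from clean data directly to any timestep.

```python
import numpy as np

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from x_0 in one shot; returns the noisy sample and the noise used."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)
alpha_bars = make_alpha_bars()
x0 = rng.standard_normal((4, 8))  # stand-in for a batch of real data
x_t, eps = add_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)

# A denoising network would be trained so that its output on (x_t, t)
# matches eps, for example by minimizing mean((predicted_eps - eps) ** 2).
```

As `t` grows, `alpha_bars[t]` shrinks toward zero, so `x_t` keeps less of the original data and more of the noise.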

Question 2

How is a diffusion model different from a GAN?

Accepted Answer

Unlike GANs, which use a generator and discriminator in adversarial training, diffusion models learn a denoising process directly. This makes diffusion models more stable to train and less prone to mode collapse, but sampling is typically slower because it requires many sequential steps. GANs can generate samples in one forward pass but are harder to train.
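To make the sampling-cost difference concrete, here is a rough sketch that counts network evaluations. `CountingNet` is a hypothetical stand-in that returns zeros instead of real predictions, and the linear schedule and 1000 steps are assumed values; the loop follows the usual DDPM-style ancestral sampling update.

```python
import numpy as np

class CountingNet:
    """Hypothetical stand-in for a trained network; it only counts its calls."""
    def __init__(self):
        self.calls = 0

    def __call__(self, x, t=None):
        self.calls += 1
        return np.zeros_like(x)  # placeholder output, not a real prediction

def gan_sample(generator, rng, shape):
    z = rng.standard_normal(shape)  # latent noise
    return generator(z)             # one forward pass yields the sample

def ddpm_sample(eps_model, rng, shape, num_steps=1000):
    betas = np.linspace(1e-4, 0.02, num_steps)  # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal(shape)              # start from pure noise
    for t in reversed(range(num_steps)):        # each step needs the previous result
        eps_hat = eps_model(x, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

rng = np.random.default_rng(0)
generator, eps_model = CountingNet(), CountingNet()
gan_sample(generator, rng, (1, 8))
ddpm_sample(eps_model, rng, (1, 8))
print(generator.calls, eps_model.calls)  # prints: 1 1000
```

Because each denoising step consumes the output of the one before it, the 1000 calls cannot be run in parallel for a single sample.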

Question 3

What are the main limitations of diffusion models?

Accepted Answer

The primary limitation is slow sampling: generating one sample requires many sequential denoising steps, which can make real-time generation challenging. Additionally, the models are computationally expensive to train and require large datasets. Methods such as DDIM and latent diffusion mitigate some of these issues: DDIM reduces the number of sampling steps, while latent diffusion runs the process in a compressed latent space.
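As an illustration of how step reduction can work, the sketch below applies a deterministic DDIM-style update over a strided subset of timesteps. It is a toy under stated assumptions: the function name `ddim_sample`, the linear schedule, and the 20-step count are choices made for the example, and `oracle_eps` is a hypothetical perfect noise predictor for a dataset containing a single point, used only so the result can be checked.

```python
import numpy as np

def ddim_sample(eps_model, x_T, alpha_bars, num_sample_steps):
    """Deterministic DDIM-style sampling on an evenly strided timestep subset."""
    num_train_steps = len(alpha_bars)
    timesteps = np.linspace(num_train_steps - 1, 0, num_sample_steps).round().astype(int)
    x = x_T
    for i, t in enumerate(timesteps):
        ab_t = alpha_bars[t]
        # After the last visited timestep, jump to the clean sample (alpha_bar = 1).
        ab_prev = alpha_bars[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        eps_hat = eps_model(x, t)
        x0_hat = (x - np.sqrt(1.0 - ab_t) * eps_hat) / np.sqrt(ab_t)  # predicted clean data
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat
    return x

# Toy setup: the "dataset" is one point, so the exact noise is known in closed form.
target = np.array([1.0, -2.0, 0.5])
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))  # assumed schedule
calls = []

def oracle_eps(x, t):
    calls.append(t)
    return (x - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])

rng = np.random.default_rng(0)
sample = ddim_sample(oracle_eps, rng.standard_normal(3), alpha_bars, num_sample_steps=20)
print(len(calls))  # prints: 20
```

Here the sampler visits 20 of the 1000 training timesteps. With a learned, imperfect noise predictor, using fewer steps generally trades some sample quality for speed.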

Diffusion Model
