AI & Generative Media

Diffusion Model

Also known as: Diffusion, Denoising Diffusion

A generative AI architecture that creates images by learning to reverse a gradual noising process, powering systems like Stable Diffusion and DALL-E.

Diffusion models generate images by learning to reverse a process that gradually adds noise to training images: generation starts from pure noise and iteratively refines it into a coherent image.

How It Works

Training:

  1. Take real images
  2. Gradually add noise until pure static
  3. Train model to predict and remove noise at each step
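The forward (noising) half of training can be sketched in a few lines. This is an illustrative toy, assuming the standard DDPM linear noise schedule; the 8×8 array stands in for a real image, and the returned noise is the target a real model would be trained to predict.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (DDPM-style)
alpha_bars = np.cumprod(1.0 - betas)    # cumulative fraction of signal kept

def add_noise(x0, t, rng):
    """Jump directly to noise level t: x_t = sqrt(ab_t)*x0 + sqrt(1-ab_t)*eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps  # eps is the training target: the model learns to predict it

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, size=(8, 8))    # stand-in for a real image
xt, eps = add_noise(x0, t=T - 1, rng=rng)

# At t near T, almost no signal remains -- the sample is pure static:
print(alpha_bars[T - 1])
```

Because the cumulative product `alpha_bars[t]` gives the noise level at any step in closed form, training can sample a random `t` per image rather than simulating every step.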

Generation:

  1. Start with random noise
  2. Iteratively denoise guided by text prompt
  3. End with coherent image matching description
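The generation loop above can be sketched as DDPM ancestral sampling. Here `predict_noise` is a placeholder that returns zeros; a real system would call a trained denoiser (e.g. a text-conditioned U-Net) at this point, which is where the prompt guidance enters.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(xt, t):
    # Placeholder for the learned, prompt-conditioned noise predictor.
    return np.zeros_like(xt)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))              # step 1: start from random noise
for t in range(T - 1, 0, -1):                # step 2: iteratively denoise
    eps = predict_noise(x, t)
    x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)  # fresh noise per step
# final step (t = 0): no noise is added, leaving the finished sample
eps = predict_noise(x, 0)
x = (x - betas[0] / np.sqrt(1.0 - alpha_bars[0]) * eps) / np.sqrt(alphas[0])
```

With a trained denoiser, each pass removes a little of the predicted noise, so `x` drifts from static toward an image matching the prompt.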

Why It Works

By learning to undo destruction step by step, the model implicitly learns the structure of natural images: what makes an image look like a "cat" or a "sunset."

Variants

  • DDPM: The original denoising diffusion probabilistic model
  • Latent Diffusion: Runs the diffusion process in a compressed latent space, making it much faster
  • Stable Diffusion: Open-source latent diffusion model
  • SDXL: Higher-resolution Stable Diffusion variant

Advantages

  • High-quality, diverse outputs
  • Fine control through guidance
  • Can be conditioned on various inputs
  • More stable training than GANs

External Resources