Elucidating the Design Space of Diffusion-Based Generative Models

Tero Karras, M. Aittala, Timo Aila, S. Laine

Published 2022 in Neural Information Processing Systems

ABSTRACT

We argue that the theory and practice of diffusion-based generative models are currently unnecessarily convoluted and seek to remedy the situation by presenting a design space that clearly separates the concrete design choices. This lets us identify several changes to both the sampling and training processes, as well as preconditioning of the score networks. Together, our improvements yield new state-of-the-art FID of 1.79 for CIFAR-10 in a class-conditional setting and 1.97 in an unconditional setting, with much faster sampling (35 network evaluations per image) than prior designs. To further demonstrate their modular nature, we show that our design changes dramatically improve both the efficiency and quality obtainable with pre-trained score networks from previous work, including improving the FID of a previously trained ImageNet-64 model from 2.07 to near-SOTA 1.55, and after re-training with our proposed improvements to a new SOTA of 1.36.

PUBLICATION RECORD

Publication year: 2022
Venue: Neural Information Processing Systems
Publication date: 2022-06-01
Fields of study: Mathematics, Computer Science
Identifiers: DOI 10.48550/arXiv.2206.00364; arXiv 2206.00364
Source metadata: Semantic Scholar

REFERENCES

StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets (2022, cited by this paper)
Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models (2022, cited by this paper)
DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps (2022, cited by this paper)
Classifier-Free Diffusion Guidance (2022, influential reference)
Subspace Diffusion Generative Models (2022, cited by this paper)
Fast Sampling of Diffusion Models with Exponential Integrator (2022, cited by this paper)
Hierarchical Text-Conditional Image Generation with CLIP Latents (2022, cited by this paper)
Video Diffusion Models (2022, cited by this paper)
Perception Prioritized Training of Diffusion Models (2022, cited by this paper)
Pseudo Numerical Methods for Diffusion Models on Manifolds (2022, cited by this paper)
Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality (2022, cited by this paper)
Alias-Free Generative Adversarial Networks (2021, cited by this paper)
Generative Adversarial Networks (2021, cited by this paper)
Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed (2021, cited by this paper)
Improved Denoising Diffusion Probabilistic Models (2021, influential reference)
Diffusion Models Beat GANs on Image Synthesis (2021, influential reference)
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech (2021, cited by this paper)
Gotta Go Fast When Generating Data with Score-Based Models (2021, influential reference)
A Variational Perspective on Diffusion-Based Generative Models and Score Matching (2021, cited by this paper)
Learning to Efficiently Sample from Diffusion Probabilistic Models (2021, cited by this paper)
Score-based Generative Modeling in Latent Space (2021, cited by this paper)
Cascaded Diffusion Models for High Fidelity Image Generation (2021, influential reference)
Diffusion Normalizing Flow (2021, cited by this paper)
Zero-Shot Translation using Diffusion Models (2021, cited by this paper)
Palette: Image-to-Image Diffusion Models (2021, cited by this paper)
Diffusion Autoencoders: Toward a Meaningful and Decodable Representation (2021, cited by this paper)
Diffusion Models for Implicit Image Segmentation Ensembles (2021, cited by this paper)
Label-Efficient Semantic Segmentation with Diffusion Models (2021, cited by this paper)
Score-Based Generative Modeling with Critically-Damped Langevin Diffusion (2021, cited by this paper)
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models (2021, cited by this paper)
High-Resolution Image Synthesis with Latent Diffusion Models (2021, cited by this paper)
Denoising Diffusion Probabilistic Models (2020, cited by this paper)
Score-Based Generative Modeling through Stochastic Differential Equations (2020, influential reference)
Training Generative Adversarial Networks with Limited Data (2020, cited by this paper)
Denoising Diffusion Implicit Models (2020, influential reference)
Normalization Techniques in Training DNNs: Methodology, Analysis and Application (2020, cited by this paper)
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains (2020, cited by this paper)
DiffWave: A Versatile Diffusion Model for Audio Synthesis (2020, cited by this paper)
Generative Modeling by Estimating Gradients of the Data Distribution (2019, cited by this paper)
StarGAN v2: Diverse Image Synthesis for Multiple Domains (2019, cited by this paper)
Noise2Noise: Learning Image Restoration without Clean Data (2018, cited by this paper)
A Style-Based Generator Architecture for Generative Adversarial Networks (2018, cited by this paper)
GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium (2017, cited by this paper)
Rethinking the Inception Architecture for Computer Vision (2015, cited by this paper)
Deep Unsupervised Learning using Nonequilibrium Thermodynamics (2015, cited by this paper)
Modify the Improved Euler scheme to integrate stochastic differential equations (2012, cited by this paper)
A Connection Between Score Matching and Denoising Autoencoders (2011, cited by this paper)
ImageNet: A large-scale hierarchical image database (2009, influential reference)
Théorie analytique de la chaleur (2009, cited by this paper)
Learning Multiple Layers of Features from Tiny Images (2009, influential reference)
Estimation of Non-Normalized Statistical Models by Score Matching (2005, influential reference)
Introduction to Numerical Analysis (2001, cited by this paper)
Computer methods for ordinary differential equations and differential-algebraic equations (1998, cited by this paper)
Representations of Knowledge in Complex Systems (1994, cited by this paper)
Neural Networks for Pattern Recognition (1993, cited by this paper)
Reverse-time diffusion equation models (1982, cited by this paper)
A family of embedded Runge-Kutta formulae (1980, cited by this paper)

CITED BY

Predict-Project-Renoise: Sampling Diffusion Models under Hard Constraints (2026, cites this paper)
Benchmarking Uncertainty Quantification of Plug-and-Play Diffusion Priors for Inverse Problems Solving (2026, cites this paper)
VARDiff: Vision-augmented retrieval-guided diffusion for stock forecasting (2026, cites this paper)
Revisiting Diffusion Model Predictions Through Dimensionality (2026, cites this paper)
Pareto-Conditioned Diffusion Models for Offline Multi-Objective Optimization (2026, influential citation)
Rethinking the Design Space of Reinforcement Learning for Diffusion Models: On the Importance of Likelihood Estimation Beyond Loss Design (2026, cites this paper)
NADD: Amplifying Noise for Effective Diffusion-based Adversarial Purification (2026, influential citation)
Visible-Light-Guided Infrared Image Super Resolution With Dual Amplitude-Phase Optimization (2026, cites this paper)
Demystifying Data-Driven Probabilistic Medium-Range Weather Forecasting (2026, influential citation)
Entropy-Based Dimension-Free Convergence and Loss-Adaptive Schedules for Diffusion Models (2026, cites this paper)
Enhancing Diffusion-Based Quantitatively Controllable Image Generation via Matrix-Form EDM and Adaptive Vicinal Training (2026, influential citation)
Stochastic Interpolants in Hilbert Spaces (2026, cites this paper)
Distance Marching for Generative Modeling (2026, cites this paper)
Generative Artificial Intelligence creates delicious, sustainable, and nutritious burgers (2026, cites this paper)
Enhancing Massive MIMO Symbol Detection in Unknown Noise Environments: A Generative Modeling Approach (2026, cites this paper)
Variance-Reduced Diffusion Sampling via Target Score Identity (2026, influential citation)
A Complete Decomposition of Stochastic Differential Equations (2026, influential citation)
DSA-Diff: Dynamic schedule alignment for training-Inference consistent modality translation in x-prediction diffusion model (2026, cites this paper)
Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning (2026, cites this paper)
ART for Diffusion Sampling: A Reinforcement Learning Approach to Timestep Schedule (2026, influential citation)
DeRaDiff: Denoising Time Realignment of Diffusion Models (2026, cites this paper)
Prior-Informed Flow Matching for Graph Reconstruction (2026, cites this paper)
Conformal Prediction for Generative Models via Adaptive Cluster-Based Density Estimation (2026, cites this paper)
SSI-DM: Singularity Skipping Inversion of Diffusion Models (2026, influential citation)
FaceSnap: Enhanced ID-Fidelity Network for Tuning-Free Portrait Customization (2026, influential citation)
SADER: Structure-Aware Diffusion Framework with DEterministic Resampling for Multi-Temporal Remote Sensing Cloud Removal (2026, cites this paper)
Stabilizing Diffusion Posterior Sampling by Noise–Frequency Continuation (2026, cites this paper)
Sparsely Supervised Diffusion (2026, cites this paper)
Semantics-Aware Generative Latent Data Augmentation for Learning in Low-Resource Domains (2026, cites this paper)
ConsistentRFT: Reducing Visual Hallucinations in Flow-based Reinforcement Fine-Tuning (2026, cites this paper)
Diff-Mamba: A diffusion-Mamba framework for hyperspectral image classification (2026, cites this paper)
Generative artificial intelligence and marine ecological monitoring (2026, cites this paper)
Noise-Conditioned Adversarial Diffusion Denoising Implicit Model (NCA-DDIM) for Data Augmentation in Engineering Applications (2026, cites this paper)
ExposeAnyone: Personalized Audio-to-Expression Diffusion Models Are Robust Zero-Shot Face Forgery Detectors (2026, cites this paper)
Annealed Langevin Posterior Sampling (ALPS): A Rapid Algorithm for Image Restoration with Multiscale Energy Models (2026, influential citation)
GDiT: A graph-prior-guided diffusion transformer for semantic-controllable remote sensing image synthesis (2026, cites this paper)
Accelerating Diffusion-Based Denoising Model With Optimized Time Steps (2026, cites this paper)
GenDA: Generative Data Assimilation on Complex Urban Areas via Classifier-Free Diffusion Guidance (2026, cites this paper)
Diffusion Representations for Fine-Grained Image Classification: A Marine Plankton Case Study (2026, cites this paper)
Communication-efficient Federated Graph Classification via Generative Diffusion Modeling (2026, cites this paper)
Latent Diffusion for Internet of Things Attack Data Generation in Intrusion Detection (2026, cites this paper)
Splat-Portrait: Generalizing Talking Heads with Gaussian Splatting (2026, cites this paper)
Divergence-Free Diffusion Models for Incompressible Fluid Flows (2026, influential citation)
Deep Neural Networks as Iterated Function Systems and a Generalization Bound (2026, cites this paper)
Exploring the transformer-based and diffusion-based models for single image deblurring (2026, cites this paper)
On Forgetting and Stability of Score-based Generative models (2026, cites this paper)
Zero-Shot Video Restoration and Enhancement with Assistance of Video Diffusion Models (2026, cites this paper)
Particle-Guided Diffusion Models for Partial Differential Equations (2026, influential citation)
SurrogateSHAP: Training-Free Contributor Attribution for Text-to-Image (T2I) Models (2026, cites this paper)
Cascaded Flow Matching for Heterogeneous Tabular Data with Mixed-Type Features (2026, cites this paper)
GLAD: Generative Language-Assisted Visual Tracking for Low-Semantic Templates (2026, cites this paper)
GEPC: Group-Equivariant Posterior Consistency for Out-of-Distribution Detection in Diffusion Models (2026, cites this paper)
DIAMOND: Directed Inference for Artifact Mitigation in Flow Matching Models (2026, cites this paper)
Sampling-Free Diffusion Transformers for Low-Complexity MIMO Channel Estimation (2026, cites this paper)
Unlocking the Duality between Flow and Field Matching (2026, influential citation)
Zero-Flow Encoders (2026, cites this paper)
On Stability and Robustness of Diffusion Posterior Sampling for Bayesian Inverse Problems (2026, cites this paper)
SSNAPS: Audio-Visual Separation of Speech and Background Noise with Diffusion Inverse Sampling (2026, influential citation)
Fast Sampling for Flows and Diffusions with Lazy and Point Mass Stochastic Interpolants (2026, cites this paper)
Ultra Fast PDE Solving via Physics Guided Few-step Diffusion (2026, influential citation)
Expert-Data Alignment Governs Generation Quality in Decentralized Diffusion Models (2026, cites this paper)
Spectral Evolution Search: Efficient Inference-Time Scaling for Reward-Aligned Image Generation (2026, cites this paper)
Bridging User Dynamic Preferences: A Unified Bridge-Based Diffusion Model for Next POI Recommendation (2026, cites this paper)
Image super-resolution reconstruction based on conditional diffusion model in crack detection and segmentation of shield tunnels (2026, cites this paper)
Progressive Learning of Instance-Level Proxy Semantics for Few-Shot Action Recognition (2026, cites this paper)
FEANet: A Novel Fusion-Enhanced Dual-Subnetwork for Antinoise Hyperpansharpening (2026, cites this paper)
DM-VSR: Depth-Aware Diffusion Models With Adaptive Modulation for Video Super-Resolution (2026, cites this paper)
BiKC+: Bimanual Hierarchical Imitation With Keypose-Conditioned Coordination-Aware Consistency Policies (2026, cites this paper)
A transformer partial discharge detection method based on MambaGAN-driven data augmentation (2026, cites this paper)
Categorical Reparameterization with Denoising Diffusion models (2026, cites this paper)
Luminark: Training-free, Probabilistically-Certified Watermarking for General Vision Generative Models (2026, cites this paper)
Document image shadow removal via score-based gradient-guided generative model (2026, cites this paper)
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control (2026, cites this paper)
Efficient inverse reconstruction of internal flow fields in rocket engine combustion chambers based on diffusion models (2026, cites this paper)
GlobalPaint: Spatiotemporal Coherent Video Outpainting with Global Feature Guidance (2026, cites this paper)
PredLDM: Spatiotemporal Sequence Prediction with Latent Diffusion Models (2026, cites this paper)
Super-resolution and denoising of corneal B-scan OCT imaging using diffusion model plug-and-play priors (2026, cites this paper)
End-to-end reconstruction of OCT optical properties and speckle-reduced structural intensity via physics-based learning (2026, cites this paper)
Conditional Diffusion-Guided Visual State Space Model for Semantic Segmentation of Optical Remote Sensing Images (2026, cites this paper)
An Elementary Approach to Scheduling in Generative Diffusion Models (2026, cites this paper)
A one-step generation model with a Single-Layer Transformer: Layer number re-distillation of FreeFlow (2026, cites this paper)
Mirai: Autoregressive Visual Generation Needs Foresight (2026, cites this paper)
Ambient Dataloops: Generative Models for Dataset Refinement (2026, cites this paper)
Diffusion Model Driven Airfoil Design: From Geometry Encoding to Practical Applications (2026, influential citation)
Neural-network quantum state tomography in four-photon entanglement systems (2026, cites this paper)
Analyzing the Error of Generative Diffusion Models: From Euler-Maruyama to Higher-Order Schemes (2026, cites this paper)
Learning Accurate Storm-Scale Evolution from Observations (2026, influential citation)
DREAMSTATE: Diffusing States and Parameters for Recurrent Large Language Models (2026, cites this paper)
Benchmarking the geographic generalization of deep learning models for precipitation downscaling (2026, cites this paper)
An Improved Diffusion Model for Generating Images of a Single Category of Food on a Small Dataset (2026, cites this paper)
Can Continuous-Time Diffusion Models Generate and Solve Globally Constrained Discrete Problems? A Study on Sudoku (2026, cites this paper)
MeanCache: From Instantaneous to Average Velocity for Accelerating Flow Matching Inference (2026, cites this paper)
One-step Latent-free Image Generation with Pixel Mean Flows (2026, cites this paper)
PHDME: Physics-Informed Diffusion Models without Explicit Governing Equations (2026, cites this paper)
Noise as a Probe: Membership Inference Attacks on Diffusion Models Leveraging Initial Noise (2026, cites this paper)
PILD: Physics-Informed Learning via Diffusion (2026, cites this paper)
Conditional Denoising Model as a Physical Surrogate Model (2026, cites this paper)
HP-GAN: Harnessing pretrained networks for GAN improvement with FakeTwins and discriminator consistency (2026, cites this paper)
VideoGPA: Distilling Geometry Priors for 3D-Consistent Video Generation (2026, cites this paper)
Shiva-DiT: Residual-Based Differentiable Top-$k$ Selection for Efficient Diffusion Transformers (2026, cites this paper)