A Loss Function for Generative Neural Networks Based on Watson's Perceptual Model

Steffen Czolbe, Oswin Krause, Ingemar J. Cox, C. Igel

Published 2020 in Neural Information Processing Systems

ABSTRACT

Training Variational Autoencoders (VAEs) to generate realistic imagery requires a loss function that reflects human perception of image similarity. We propose such a loss function based on Watson's perceptual model, which computes a weighted distance in frequency space and accounts for luminance and contrast masking. We extend the model to color images, increase its robustness to translation by using the Fourier Transform, remove artifacts due to splitting the image into blocks, and make it differentiable. In experiments, VAEs trained with the new loss function generated realistic, high-quality image samples. Compared to using the Euclidean distance and the Structural Similarity Index, the images were less blurry; compared to deep neural network based losses, the new approach required fewer computational resources and generated images with fewer artifacts.
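
To make the "weighted distance in frequency space" with luminance and contrast masking concrete, here is a minimal sketch of the classical block-DCT form of Watson's model for a grayscale image. It is not the loss proposed in the paper: the paper's version works on color images, uses the Fourier Transform, avoids block artifacts, and is differentiable, none of which is reproduced here. Assumptions are labeled in the code: the sensitivity table `t` is a uniform placeholder rather than Watson's measured table, the constants `alpha = 0.649`, `w = 0.7`, and `p = 4` are the values commonly quoted for the original model, pixel values are assumed non-negative, and all function names are hypothetical.

```python
import math

B = 8  # block size of the classical DCT-based model

# Orthonormal DCT-II basis: DCT[u][x] = c(u) * cos((2x + 1) * u * pi / (2B))
DCT = [[(math.sqrt(1 / B) if u == 0 else math.sqrt(2 / B))
        * math.cos((2 * x + 1) * u * math.pi / (2 * B))
        for x in range(B)] for u in range(B)]


def dct2(block):
    """2-D DCT of one B x B block (list of rows), computed as D * block * D^T."""
    tmp = [[sum(DCT[u][x] * block[x][y] for x in range(B)) for y in range(B)]
           for u in range(B)]
    return [[sum(tmp[u][y] * DCT[v][y] for y in range(B)) for v in range(B)]
            for u in range(B)]


def blocks(img):
    """Yield the non-overlapping B x B blocks of a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    for r in range(0, h - B + 1, B):
        for c in range(0, w - B + 1, B):
            yield [row[c:c + B] for row in img[r:r + B]]


def watson_distance(img_a, img_b, t=None, alpha=0.649, w=0.7, p=4.0, eps=1e-10):
    """Watson-style distance between two equally sized grayscale images.

    img_a is treated as the reference: its DCT coefficients drive the masking.
    Assumes non-negative pixel values so that every block's DC term is >= 0.
    """
    if t is None:
        # ASSUMPTION: uniform placeholder, not Watson's measured sensitivity table.
        t = [[1.0] * B for _ in range(B)]
    coeffs_a = [dct2(b) for b in blocks(img_a)]
    coeffs_b = [dct2(b) for b in blocks(img_b)]
    mean_dc = sum(c[0][0] for c in coeffs_a) / len(coeffs_a)

    total = 0.0
    for ca, cb in zip(coeffs_a, coeffs_b):
        # Luminance masking: thresholds grow with the block's mean brightness.
        lum = ((ca[0][0] + eps) / (mean_dc + eps)) ** alpha
        for i in range(B):
            for j in range(B):
                t_l = t[i][j] * lum
                # Contrast masking: strong coefficients hide errors at that frequency.
                s = max(t_l, abs(ca[i][j]) ** w * t_l ** (1 - w))
                total += abs((ca[i][j] - cb[i][j]) / s) ** p
    return total ** (1 / p)


# Usage: a 16 x 16 synthetic image and two perturbed copies of it.
reference = [[((x * 7 + y * 13) % 256) / 255 for y in range(16)] for x in range(16)]
small = [row[:] for row in reference]
large = [row[:] for row in reference]
small[3][5] += 0.01
large[3][5] += 0.1
print(watson_distance(reference, small), watson_distance(reference, large))
```

Because the masking terms depend only on the reference image, this sketch is not symmetric in its two arguments, and the hard `max` is not smooth; the abstract's "make it differentiable" step addresses exactly that kind of issue.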

PUBLICATION RECORD

Publication year
2020
Venue
Neural Information Processing Systems
Publication date
2020-06-26
Fields of study
Mathematics, Computer Science, Engineering
Identifiers
arXiv 2006.15057
Source metadata
Semantic Scholar

REFERENCES

Deepfakes: Trick or treat?
2020, cited by this paper
E-LPIPS: Robust Perceptual Image Similarity via Random Transformation Ensembles
2019, cited by this paper
Generative Adversarial Nets
2018, cited by this paper
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
2018, influential reference
A General and Adaptive Robust Loss Function
2017, influential reference
Automatic differentiation in PyTorch
2017, cited by this paper
Generating Images with Perceptual Similarity Metrics based on Deep Networks
2016, cited by this paper
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size
2016, cited by this paper
Deep Feature Consistent Variational Autoencoder
2016, influential reference
A note on the evaluation of generative models
2015, cited by this paper
Deep multi-scale video prediction beyond mean square error
2015, cited by this paper
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
2015, cited by this paper
Autoencoding beyond pixels using a learned similarity metric
2015, cited by this paper
Very Deep Convolutional Networks for Large-Scale Image Recognition
2014, cited by this paper
Adam: A Method for Stochastic Optimization
2014, cited by this paper
Deep Learning Face Attributes in the Wild
2014, cited by this paper
Auto-Encoding Variational Bayes
2013, influential reference
Rectifier Nonlinearities Improve Neural Network Acoustic Models
2013, cited by this paper
ImageNet classification with deep convolutional neural networks
2012, cited by this paper
Digital Watermarking and Steganography
2009, cited by this paper
Using Perceptual Models to Improve Fidelity and Provide Resistance to Valumetric Scaling for Quantization Index Modulation Watermarking
2007, cited by this paper
Image quality assessment: from error visibility to structural similarity
2004, influential reference
Gradient-based learning applied to document recognition
1998, cited by this paper
Image Compression Using the Discrete Cosine Transform
1994, cited by this paper
DCT quantization matrices visually optimized for individual images
1993, cited by this paper
The JPEG still picture compression standard
1991, cited by this paper
The cortex transform: rapid computation of simulated neural images
1987, cited by this paper
An experimental comparison of RGB, YIQ, LAB, HSV, and opponent color models
1987, cited by this paper
Model of human visual-motion sensing
1985, cited by this paper
Color gamut transform pairs
1978, cited by this paper

CITED BY

Lossless Copyright Protection via Intrinsic Model Fingerprinting
2026, cites this paper
Pixel Seal: Adversarial-only training for invisible image and video watermarking
2025, cites this paper
RePack then Refine: Efficient Diffusion Transformer with Vision Foundation Model
2025, influential citation
LVMO: Stable Image Watermarking Based on Multi-Loss Optimization for Diffusion Latent Vector
2025, cites this paper
RAVQ-HoloNet: Rate-Adaptive Vector-Quantized Hologram Compression
2025, cites this paper
Video Signature: Implicit Watermarking for Video Diffusion Models
2025, cites this paper
An Efficient and Adaptive Watermark Detection System with Tile-based Error Correction
2025, cites this paper
Video Signature: In-generation Watermarking for Latent Video Diffusion Models
2025, cites this paper
VIDSTAMP: A Temporally-Aware Watermark for Ownership and Integrity in Video Diffusion Models
2025, cites this paper
Spread Spectrum Image Watermarking Through Latent Diffusion Model
2025, cites this paper
MultiNeRF: Multiple Watermark Embedding for Neural Radiance Fields
2025, cites this paper
Spatial–Temporal Coherence in Extreme Video Retargeting for Consumer Screening Devices
2025, cites this paper
Pathology-Aware Adaptive Watermarking for Text-Driven Medical Image Synthesis
2025, cites this paper
Attack-Resilient Image Watermarking Using Stable Diffusion
2024, cites this paper
Benchmarking the Robustness of Image Watermarks
2024, cites this paper
ActiveDaemon: Unconscious DNN Dormancy and Waking Up via User-specific Invisible Token
2024, cites this paper
Spatio-Temporal Consistent Non-homogeneous Extreme Video Retargeting
2024, cites this paper
WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights
2024, cites this paper
Stable Signature is Unstable: Removing Image Watermark from Diffusion Models
2024, cites this paper
UnMarker: A Universal Attack on Defensive Image Watermarking
2024, cites this paper
WMAdapter: Adding WaterMark Control to Latent Diffusion Models
2024, cites this paper
LAVIB: A Large-scale Video Interpolation Benchmark
2024, cites this paper
Towards Effective User Attribution for Latent Diffusion Models via Watermark-Informed Blending
2024, cites this paper
CSIM: A Copula-based similarity index sensitive to local changes for Image quality assessment
2024, cites this paper
Perceptual Visual Similarity from EEG: Prediction and Image Generation
2024, cites this paper
An Efficient Watermarking Method for Latent Diffusion Models via Low-Rank Adaptation
2024, cites this paper
Boosting Latent Diffusion with Perceptual Objectives
2024, cites this paper
LVMark: Robust Watermark for latent video diffusion models
2024, cites this paper
Video Seal: Open and Efficient Video Watermarking
2024, cites this paper
Robust Image Watermarking using Stable Diffusion
2024, cites this paper
Balanced Marginal and Joint Distributional Learning via Mixture Cramer-Wold Distance
2023, cites this paper
AesFA: An Aesthetic Feature-Aware Arbitrary Neural Style Transfer
2023, cites this paper
Enhancing Common Loss Functions: “Append” with Overall Metrics as Supplementary
2023, cites this paper
Explicitly Minimizing the Blur Error of Variational Autoencoders
2023, cites this paper
The Stable Signature: Rooting Watermarks in Latent Diffusion Models
2023, cites this paper
Semantic similarity metrics for image registration
2023, cites this paper
Explainable Recommender With Geometric Information Bottleneck
2023, cites this paper
Attacking Perceptual Similarity Metrics
2023, influential citation
Enhancing image quality prediction with self‐supervised visual masking
2023, cites this paper
Gaze-contingent efficient hologram compression for foveated near-eye holographic displays
2023, cites this paper
Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation
2023, cites this paper
Forest Parameter Prediction by Multiobjective Deep Learning of Regression Models Trained with Pseudo-Target Imputation
2023, cites this paper
FrePolad: Frequency-Rectified Point Latent Diffusion for Point Cloud Generation
2023, cites this paper
Shift-tolerant Perceptual Similarity Metric
2022, cites this paper
Joint neural phase retrieval and compression for energy- and computation-efficient holography on the edge
2022, cites this paper
Investigating Prompt Engineering in Diffusion Models
2022, cites this paper
A Perceptual Quality Metric for Video Frame Interpolation
2022, influential citation
Hogel-Free Holography
2022, cites this paper
Pupil-Aware Holography
2022, cites this paper
Simpler is Better: Spectral Regularization and Up-Sampling Techniques for Variational Autoencoders
2022, influential citation
Evaluating the Interpretability of Generative Models by Interactive Reconstruction
2021, cites this paper
Semantic similarity metrics for learned image registration
2021, cites this paper
Clean Images are Hard to Reblur: Exploiting the Ill-Posed Inverse Task for Dynamic Scene Deblurring
2021, cites this paper
DeepSim: Semantic similarity metrics for learned image registration
2020, cites this paper
Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving
2020, cites this paper