Z-Forcing: Training Stochastic Recurrent Networks

Anirudh Goyal, Alessandro Sordoni, Marc-Alexandre Côté, Nan Rosemary Ke, Yoshua Bengio

Published 2017 in Neural Information Processing Systems

ABSTRACT

Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNNs). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps. Training is performed with amortized variational inference, where the approximate posterior is augmented with an RNN that runs backward through the sequence. In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost that forces them to reconstruct the state of the backward recurrent network. This provides the latent variables with a task-independent objective that enhances the performance of the overall model. We found this strategy to perform better than alternative approaches such as KL annealing. Although conceptually simple, our model achieves state-of-the-art results on standard speech benchmarks such as TIMIT and Blizzard, and competitive performance on sequential MNIST. Finally, we apply our model to language modeling on the IMDB dataset, where the auxiliary cost helps in learning interpretable latent variables. Source code is available.
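
The training objective described in the abstract (a variational lower bound plus an auxiliary cost that asks each latent variable to reconstruct the backward RNN state) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes diagonal Gaussian distributions throughout, and every name in it (`gaussian_kl`, `gaussian_nll`, `step_loss`, `aux_weight`, the stand-in linear maps `W_x` and `W_b`, and all shapes) is hypothetical. In the model itself the decoder would also condition on the forward recurrent state, which is omitted here.

```python
import numpy as np


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians, summed over the last axis."""
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0,
        axis=-1,
    )


def gaussian_nll(target, mean, logvar):
    """Negative log-density of target under a diagonal Gaussian, summed over the last axis."""
    return 0.5 * np.sum(
        np.log(2.0 * np.pi) + logvar + (target - mean) ** 2 / np.exp(logvar),
        axis=-1,
    )


def step_loss(recon_nll, kl, aux_nll, aux_weight=1.0):
    """Per-step loss: negative lower bound plus a weighted auxiliary cost (assumed form)."""
    return recon_nll + kl + aux_weight * aux_nll


# Toy usage with random stand-ins for network outputs (hypothetical shapes).
rng = np.random.default_rng(0)
batch, x_dim, z_dim, b_dim = 4, 5, 3, 6
x_next = rng.normal(size=(batch, x_dim))   # next observation to predict
b_t = rng.normal(size=(batch, b_dim))      # backward RNN state at this step

# Approximate posterior q(z_t | ...) and prior p(z_t | ...), both diagonal Gaussian.
mu_q, logvar_q = rng.normal(size=(batch, z_dim)), np.zeros((batch, z_dim))
mu_p, logvar_p = np.zeros((batch, z_dim)), np.zeros((batch, z_dim))

# Reparameterized sample from the approximate posterior.
z_t = mu_q + np.exp(0.5 * logvar_q) * rng.normal(size=(batch, z_dim))

# Stand-in "decoders": fixed random linear maps with unit variance outputs.
W_x = rng.normal(size=(z_dim, x_dim))
W_b = rng.normal(size=(z_dim, b_dim))
recon = gaussian_nll(x_next, z_t @ W_x, np.zeros((batch, x_dim)))
aux = gaussian_nll(b_t, z_t @ W_b, np.zeros((batch, b_dim)))
kl = gaussian_kl(mu_q, logvar_q, mu_p, logvar_p)

loss = step_loss(recon, kl, aux, aux_weight=0.5).mean()
```

Setting `aux_weight` to zero recovers a plain negative lower bound in this sketch, which makes it easy to compare training with and without the auxiliary term.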

PUBLICATION RECORD

Publication year
2017
Venue
Neural Information Processing Systems
Publication date
2017-11-15
Fields of study
Mathematics, Computer Science
Identifiers
arXiv 1711.05411
Source metadata
Semantic Scholar

REFERENCES

A Hybrid Convolutional Variational Autoencoder for Text Generation
2017, cited by this paper
Controllable Text Generation
2017, cited by this paper
Multiplicative Normalizing Flows for Variational Bayesian Neural Networks
2017, cited by this paper
Toward Controlled Generation of Text
2017, cited by this paper
Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders
2017, cited by this paper
Piecewise Latent Variables for Neural Variational Text Processing
2016, influential reference
PixelVAE: A Latent Variable Model for Natural Images
2016, cited by this paper
Variational Lossy Autoencoder
2016, influential reference
Pixel Recurrent Neural Networks
2016, influential reference
Theano: A Python framework for fast computation of mathematical expressions
2016, cited by this paper
Professor Forcing: A New Algorithm for Training Recurrent Networks
2016, influential reference
Neural Autoregressive Distribution Estimation
2016, influential reference
Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data
2016, cited by this paper
Improved Variational Inference with Inverse Autoregressive Flow
2016, influential reference
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
2016, cited by this paper
An Architecture for Deep, Hierarchical Generative Models
2016, influential reference
Sequential Neural Models with Stochastic Layers
2016, influential reference
Variational Inference with Normalizing Flows
2015, cited by this paper
Generating Sentences from a Continuous Space
2015, influential reference
MADE: Masked Autoencoder for Distribution Estimation
2015, influential reference
A Recurrent Latent Variable Model for Sequential Data
2015, influential reference
DRAW: A Recurrent Neural Network For Image Generation
2015, influential reference
Importance Weighted Autoencoders
2015, cited by this paper
Training Deep Generative Models: Variations on a Theme
2015, cited by this paper
Iterative Neural Autoregressive Distribution Estimator NADE-k
2014, influential reference
Learning Stochastic Recurrent Networks
2014, influential reference
Adam: A Method for Stochastic Optimization
2014, cited by this paper
Semi-supervised Learning with Deep Generative Models
2014, cited by this paper
Markov Chain Monte Carlo and Variational Inference: Bridging the Gap
2014, influential reference
Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)
2014, cited by this paper
The Blizzard Challenge 2013
2013, cited by this paper
Stochastic Gradient VB and the Variational Auto-Encoder
2013, influential reference
Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription
2012, cited by this paper
The Neural Autoregressive Distribution Estimator
2011, influential reference
On the quantitative analysis of deep belief networks
2008, influential reference
Evaluating probabilities under high-dimensional latent variable models
2008, cited by this paper
Variational learning and bits-back coding: an information-theoretic view to Bayesian learning
2004, cited by this paper
Long Short-Term Memory
1997, cited by this paper
Learning representations by back-propagating errors
1986, cited by this paper

CITED BY

A noise robust and distribution-adaptive framework for multivariate time series anomaly detection.
2026, cites this paper
DySS: Dynamic Queries and State-Space Learning for Efficient 3D Object Detection from Multi-Camera Videos
2025, cites this paper
History matching of a three dimensional channelized reservoir using a beta-convolutional variational autoencoder and ensemble smoother with multiple data assimilation
2025, cites this paper
Disentangling Speech Representations Learning With Latent Diffusion for Speaker Verification
2025, cites this paper
Disentangling Speaker and Content in Pre-trained Speech Models with Latent Diffusion for Robust Speaker Verification
2025, cites this paper
Data augmentation for forecasting industrial aging processes via conditional multimodal generative time-series models
2025, cites this paper
Bayesian Meta-Reinforcement Learning with Laplace Variational Recurrent Networks
2025, cites this paper
A Hierarchical Taxonomy For Deep State Space Models
2025, cites this paper
Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
2025, cites this paper
Variational Prefix Tuning for Diverse and Accurate Code Summarization Using Pre-trained Language Models
2025, cites this paper
Leveraging Generative Adversarial Networks for Unsupervised Fraud Detection
2025, cites this paper
Interactive sequential generative models for team sports
2025, influential citation
Epileptic Seizure Detection Based on Attitude Angle Signal of Wearable Device
2025, cites this paper
Learning Interpretable Representations Leads to Semantically Faithful EEG-to-Text Generation
2025, cites this paper
Characterizing the sense of agency in human–robot interaction based on the free energy principle
2025, cites this paper
Spatial-temporal context-aware network for 3D-Craft generation
2025, cites this paper
Unifying complete and incomplete multi-view clustering through an information-theoretic generative model
2024, cites this paper
Efficient Continual Learning for Small Language Models with a Discrete Key-Value Bottleneck
2024, cites this paper
Deep Generative Modeling for Identification of Noisy, Non-Stationary Dynamical Systems
2024, cites this paper
Machine cross-domain remaining useful life prediction via contrastive adversarial variational recurrent method
2024, cites this paper
Learning Physics for Unveiling Hidden Earthquake Ground Motions via Conditional Generative Modeling
2024, cites this paper
Enhanced stochastic recurrent hybrid model for RUL Predictions via Semi-supervised learning
2024, cites this paper
FutureDepth: Learning to Predict the Future Improves Video Depth Estimation
2024, cites this paper
Improving disentanglement in variational auto-encoders via feature imbalance-informed dimension weighting
2024, cites this paper
Causal Recurrent Variational Autoencoder for Medical Time Series Generation
2023, influential citation
Learning Mixture Structure on Multi-Source Time Series for Probabilistic Forecasting
2023, cites this paper
Time-series Generation by Contrastive Imitation
2023, cites this paper
Guiding Language Model Reasoning with Planning Tokens
2023, influential citation
Sentiment Analysis for Directional Stock Market Prediction: Advance Data Processing by Leveraging Multiple Data Sources
2023, cites this paper
Addressing posterior collapse by splitting decoders in variational recurrent autoencoders
2023, cites this paper
GePSAn: Generative Procedure Step Anticipation in Cooking Videos
2023, cites this paper
Diverse Image Captioning via Conditional Variational Autoencoder and Dual Contrastive Learning
2023, influential citation
DEIR: Efficient and Robust Exploration through Discriminative-Model-Based Episodic Intrinsic Rewards
2023, cites this paper
Deep networks for system identification: a Survey
2023, cites this paper
A Stochastic Recurrent Encoder Decoder Network for Multistep Probabilistic Wind Power Predictions
2023, cites this paper
Implicit Feature Decoupling with Depthwise Quantization
2022, cites this paper
Semi-Supervised Generative Models for Multiagent Trajectories
2022, cites this paper
Towards Capturing the Temporal Dynamics for Trajectory Prediction: a Coarse-to-Fine Approach
2022, cites this paper
Hierarchical Strategies for Cooperative Multi-Agent Reinforcement Learning
2022, cites this paper
State-Regularized Recurrent Neural Networks to Extract Automata and Explain Predictions
2022, cites this paper
Recurrence Boosts Diversity! Revisiting Recurrent Latent Variable in Transformer-Based Variational AutoEncoder for Diverse Text Generation
2022, cites this paper
Multi-Task Dynamical Systems
2022, cites this paper
DynaLAP: Human Activity Recognition in Fixed Protocols via Semi-Supervised Variational Recurrent Neural Networks With Dynamic Priors
2022, cites this paper
GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation
2022, cites this paper
AS-IntroVAE: Adversarial Similarity Distance Makes Robust IntroVAE
2022, influential citation
Weakly-Supervised Generation and Grounding of Visual Descriptions with Conditional Generative Models
2022, cites this paper
Evaluation of creating scoring opportunities for teammates in soccer via trajectory prediction
2022, cites this paper
Estimating Counterfactual Treatment Outcomes Over Time in Complex Multiagent Scenarios
2022, cites this paper
Competency Assessment for Autonomous Agents using Deep Generative Models
2022, cites this paper
SocialVAE: Human Trajectory Prediction using Timewise Latents
2022, cites this paper
Interpretable Latent Variables in Deep State Space Models
2022, cites this paper
Data-to-text Generation with Variational Sequential Planning
2022, cites this paper
Benchmarking Generative Latent Variable Models for Speech
2022, cites this paper
A Generative Feature-to-Image Robotic Vision Framework for 6D Pose Measurement of Metal Parts
2022, cites this paper
Latent Target-Opinion as Prior for Document-Level Sentiment Classification: A Variational Approach from Fine-Grained Perspective
2021, cites this paper
Dual-CLVSA: a Novel Deep Learning Approach to Predict Financial Markets with Sentiment Measurements
2021, cites this paper
Dual-View Conditional Variational Auto-Encoder for Emotional Dialogue Generation
2021, cites this paper
Estimating the Value-at-Risk by Temporal VAE
2021, cites this paper
History Marginalization Improves Forecasting in Variational Recurrent Neural Networks
2021, influential citation
Modeling Irregular Time Series with Continuous Recurrent Units
2021, influential citation
Deep Neural Networks and End-to-End Learning for Audio Compression
2021, cites this paper
Toward Automatically Labeling Situations in Soccer
2021, influential citation
Contrastively Disentangled Sequential Variational Autoencoder
2021, cites this paper
Stochastic Temporal Difference Learning for Sequence Data
2021, cites this paper
LSD-StructureNet: Modeling Levels of Structural Detail in 3D Part Hierarchies
2021, cites this paper
Policy Gradients Incorporating the Future
2021, cites this paper
GenRadar: Self-supervised Probabilistic Camera Synthesis based on Radar Frequencies
2021, cites this paper
Innovations Autoencoder and its Application in One-class Anomalous Sequence Detection
2021, cites this paper
CatVRNN: Generating Category Texts via Multi-task Learning
2021, cites this paper
Incorporate Maximum Mean Discrepancy in Recurrent Latent Space for Sequential Generative Model
2021, cites this paper
A Benchmark of Dynamical Variational Autoencoders applied to Speech Spectrogram Modeling
2021, cites this paper
Forecasting Reservoir Inflow via Recurrent Neural ODEs
2021, influential citation
Stochastic Recurrent Neural Network for Multistep Time Series Forecasting
2021, cites this paper
Visual Alignment Constraint for Continuous Sign Language Recognition
2021, cites this paper
VDSM: Unsupervised Video Disentanglement with State-Space Modeling and Deep Mixtures of Experts
2021, cites this paper
Kanerva++: extending The Kanerva Machine with differentiable, locally block allocated latent memory
2021, influential citation
MuseBar: Alleviating Posterior Collapse in VAE Towards Music Generation
2021, cites this paper
Deep Learning for Text Attribute Transfer on Auto Encoder Models
2021, cites this paper
VAE^2: Preventing Posterior Collapse of Variational Video Predictions in the Wild
2021, cites this paper
Learning deep autoregressive models for hierarchical data
2021, cites this paper
Disentangled Sequence Clustering for Human Intention Inference
2021, cites this paper
Giving Attention to Generative VAE Models for De Novo Molecular Design
2021, cites this paper
Variational Dynamic Mixtures
2020, cites this paper
Relational State-Space Model for Stochastic Multi-Object Systems
2020, influential citation
Deep Bayesian Data Mining
2020, cites this paper
Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
2020, cites this paper
Latent Emotion Memory for Multi-Label Emotion Classification
2020, cites this paper
Session-based recommendation via flow-based deep generative networks and Bayesian inference
2020, cites this paper
Variational Transformers for Diverse Response Generation
2020, influential citation
Deep State Space Models for Nonlinear System Identification
2020, cites this paper
Unsupervised Multimodal Video-to-Video Translation via Self-Supervised Learning
2020, cites this paper
Bayesian Graph Neural Networks with Adaptive Connection Sampling
2020, cites this paper
Dynamic Neural Relational Inference
2020, cites this paper
A Generative Model for Joint Natural Language Understanding and Generation
2020, cites this paper
Deep State-Space Model for Noise Tolerant Skeleton-Based Action Recognition
2020, cites this paper
Topic-Enhanced Capsule Network for Multi-Label Emotion Classification
2020, cites this paper
Decentralized policy learning with partial observation and mechanical constraints for multiperson modeling
2020, cites this paper
Dynamic Neural Relational Inference for Forecasting Trajectories
2020, cites this paper
An LSTM-Based Model for Person Identification Using ECG Signal
2020, cites this paper
Dynamical Variational Autoencoders: A Comprehensive Review
2020, influential citation