Dilated Recurrent Neural Networks

Shiyu Chang, Yang Zhang, Wei Han, Mo Yu, Xiaoxiao Guo, Wei Tan, Xiaodong Cui, M. Witbrock, M. Hasegawa-Johnson, Thomas S. Huang

Published 2017 in Neural Information Processing Systems

ABSTRACT

Learning with recurrent neural networks (RNNs) on long sequences is a notoriously difficult task. There are three major challenges: 1) complex dependencies, 2) vanishing and exploding gradients, and 3) efficient parallelization. In this paper, we introduce a simple yet effective RNN connection structure, the DilatedRNN, which simultaneously tackles all of these challenges. The proposed architecture is characterized by multi-resolution dilated recurrent skip connections and can be combined flexibly with diverse RNN cells. Moreover, the DilatedRNN reduces the number of parameters needed and enhances training efficiency significantly, while matching state-of-the-art performance (even with standard RNN cells) in tasks involving very long-term dependencies. To provide a theory-based quantification of the architecture's advantages, we introduce a memory capacity measure, the mean recurrent length, which is more suitable for RNNs with long skip connections than existing measures. We rigorously prove the advantages of the DilatedRNN over other recurrent neural architectures. The code for our method is publicly available at this https URL
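The abstract describes dilated recurrent skip connections without giving the recurrence itself. The following is a minimal sketch of the idea, not the authors' released code: each layer's state at step t is computed from the state `dilation` steps earlier instead of the immediately preceding one, and layers with different dilations are stacked to obtain several temporal resolutions. The function names, the vanilla tanh cell, the random weights, and the dilations (1, 2, 4) are all illustrative assumptions.

```python
import math
import random


def dilated_rnn_layer(inputs, dilation, w_in, w_rec, bias):
    """One dilated layer with a vanilla tanh cell (illustrative assumption).

    h[t] = tanh(W_in x[t] + W_rec h[t - dilation] + b),
    where h[t - dilation] is taken as zero when t < dilation.
    """
    n_hid = len(bias)
    zeros = [0.0] * n_hid
    states = []
    for t, x in enumerate(inputs):
        # Skip connection: read the state `dilation` steps back, not t - 1.
        prev = states[t - dilation] if t >= dilation else zeros
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * pi for w, pi in zip(w_rec[j], prev))
                + bias[j]
            )
            for j in range(n_hid)
        ]
        states.append(h)
    return states


def random_matrix(rows, cols, rng):
    """Small random weights; a placeholder for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def dilated_rnn(inputs, n_hid, dilations=(1, 2, 4), seed=0):
    """Stack dilated layers; each layer consumes the previous layer's states."""
    rng = random.Random(seed)
    layer_input = inputs
    for d in dilations:
        n_in = len(layer_input[0])
        w_in = random_matrix(n_hid, n_in, rng)
        w_rec = random_matrix(n_hid, n_hid, rng)
        bias = [0.0] * n_hid
        layer_input = dilated_rnn_layer(layer_input, d, w_in, w_rec, bias)
    return layer_input


# Toy input: a 16-step scalar sequence.
sequence = [[math.sin(0.3 * t)] for t in range(16)]
outputs = dilated_rnn(sequence, n_hid=4)
print(len(outputs), len(outputs[0]))
```

Within a layer of dilation d, the steps split into d chains that never read from one another (for d = 2, the even steps and the odd steps). This is one way to see how such a structure could lend itself to parallel computation, and why gradients cross fewer recurrent steps to span the same distance in time.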

PUBLICATION RECORD

Publication year: 2017
Venue: Neural Information Processing Systems
Publication date: 2017-10-05
Fields of study: Computer Science
Identifiers: arXiv 1710.02224
Source metadata: Semantic Scholar

REFERENCES

Dilated Residual Networks
2017, cited by this paper
Learning to Skim Text
2017, cited by this paper
FeUdal Networks for Hierarchical Reinforcement Learning
2017, cited by this paper
Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences
2016, cited by this paper
Recurrent Dropout without Memory Loss
2016, cited by this paper
HyperNetworks
2016, cited by this paper
Recurrent Batch Normalization
2016, cited by this paper
Hierarchical Multiscale Recurrent Neural Networks
2016, cited by this paper
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
2016, cited by this paper
WaveNet: A Generative Model for Raw Audio
2016, influential reference
Layer Normalization
2016, cited by this paper
Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations
2016, cited by this paper
Architectural Complexity Measures of Recurrent Neural Networks
2016, cited by this paper
Full-Capacity Unitary Recurrent Neural Networks
2016, cited by this paper
SUPERSEDED - CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit
2016, cited by this paper
Unitary Evolution Recurrent Neural Networks
2015, cited by this paper
Multi-Scale Context Aggregation by Dilated Convolutions
2015, cited by this paper
Learning the speech front-end with raw waveform CLDNNs
2015, cited by this paper
A Simple Way to Initialize Recurrent Networks of Rectified Linear Units
2015, cited by this paper
A Clockwork RNN
2014, cited by this paper
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
2014, cited by this paper
On the difficulty of training recurrent neural networks
2012, cited by this paper
A brief survey on sequence classification
2010, cited by this paper
Gradient-based learning applied to document recognition
1998, cited by this paper
Long Short-Term Memory
1997, cited by this paper
Hierarchical Recurrent Neural Networks for Long-Term Dependencies
1995, cited by this paper
Building a Large Annotated Corpus of English: The Penn Treebank
1993, cited by this paper
A SYSTEMIC STUDY OF MONETARY SYSTEMS
1982, cited by this paper

CITED BY

RAMSeS: Robust and Adaptive Model Selection for Time-Series Anomaly Detection Algorithms
2026, cites this paper
Rethinking Recurrent Neural Networks for Time Series Forecasting: A Reinforced Recurrent Encoder with Prediction-Oriented Proximal Policy Optimization
2026, cites this paper
VAML‐Net: Unsupervised Anomaly Detection for Multivariate Time Series in Space‐Air‐Ground Integrated Network (SAGIN) Environments Through a Variational Autoencoder and Multiresolution LSTM
2026, cites this paper
Multi-Ecosystem Modeling of OSS Project Sustainability
2026, cites this paper
TEFormer: Structured Bidirectional Temporal Enhancement Modeling in Spiking Transformers
2026, cites this paper
Rising From Pieces: Effective Inference at the Edge via Robust Split ML
2026, cites this paper
Dynamic Kolmogorov-Arnold networks for time-varying degradation modeling in solid oxide fuel cells
2026, cites this paper
A multispectral pansharpening method based on CNN-DI network with mixture of experts
2025, cites this paper
Fourier Basis Mapping: A Time-Frequency Learning Framework for Time Series Forecasting
2025, cites this paper
DLiGRU-X: Efficient X-Vector-Based Embeddings for Small-Footprint Keyword Spotting System
2025, cites this paper
Deep Residual Echo State Networks: exploring residual orthogonal connections in untrained Recurrent Neural Networks
2025, cites this paper
StoxLSTM: A Stochastic Extended Long Short-Term Memory Network for Time Series Forecasting
2025, cites this paper
Enhancement of single candidate optimizer for weighted feature fusion and dilation-based cascaded RNN in learning-based recommendation system
2025, cites this paper
Multi-scale temporal representation with sparse dynamic graph learning for district heat load forecasting
2025, cites this paper
A Systematic Blockchain‐Based Proficient, Secure, and Energetic Privacy‐Preserving Protocol for Effective Authentication in Internet of Vehicles Networks Using the El‐Gamal Encryption With Optimal Key Selection
2025, cites this paper
Time-Series Anomaly Detection for Sensor Data: Models, Metrics, and Methodologies—A Review
2025, cites this paper
Efficient and robust temporal processing with neural oscillations modulated spiking neural networks
2025, influential citation
Modelradar: aspect-based forecast evaluation
2025, cites this paper
Clustered Bootstrap Similarity Learning Framework: Time Series Forecasting
2025, cites this paper
Could seismo-volcanic catalogs be improved or created using weakly supervised approaches with pre-trained systems?
2025, cites this paper
SFFHO: Development of Statistical Fitness-based Fire Hawk Optimizer for Software Testing and Maintenance Approach using Adaptive Deep Learning Method
2025, cites this paper
Lead-LagNet: Exploiting Lead-Lag Dependencies for Cross-Series Temporal Prediction
2025, cites this paper
Non-Markovianity and memory enhancement in Quantum Reservoir Computing
2025, cites this paper
Context-driven Deep Learning Forecasting for Wastewater Treatment Plants
2025, cites this paper
SANTM: A Sparse Access Neural Turing Machine with local multi-head self-attention for long-term memorization
2025, cites this paper
Anomaly Detection in Event-Triggered Traffic Time Series via Similarity Learning
2025, cites this paper
MultiPerG: Multiple Periodic Geography convolution for next POI recommendation
2025, cites this paper
A novel dilated weighted recurrent neural network (RNN)-based smart contract for secure sharing of big data in Ethereum blockchain using hybrid encryption schemes
2025, cites this paper
Temporal Chunking Enhances Recognition of Implicit Sequential Patterns
2025, cites this paper
Dynamic Chunking for End-to-End Hierarchical Sequence Modeling
2025, cites this paper
Enhancing short-term net load forecasting with additive neural decomposition and Weibull Attention
2025, cites this paper
Lattice protein folding with variational annealing
2025, cites this paper
Forecasting-Based Biomedical Time-series Data Synthesis for Open Data and Robust AI
2025, cites this paper
An Attention-Based Downsampling GRU Neural Network with Degradation Model for Remaining Useful Life Prediction of Aero-engine
2025, cites this paper
Simple and Efficient Multivariate Time Series Forecasting with Sample MLP
2025, cites this paper
Reinforcement Learning Based Video Summarization Using Attention Aware Dilated RNN
2025, cites this paper
An Explainable Vision Transformer-Based Web Application for Medical Decision-Making: Case of Colon Cancer
2025, cites this paper
Emergence of the Primacy Effect in Structured State-Space Models
2025, cites this paper
Forecast-Then-Optimize Deep Learning Methods
2025, cites this paper
Variational Hierarchical N-BEATS Model for Long-Term Time-Series Forecasting
2025, cites this paper
Neural Hierarchical Interpolation Time Series (NHITS) for Reservoir Level Multi-Horizon Forecasting in Hydroelectric Power Plants
2025, cites this paper
Learnability Window in Gated Recurrent Neural Networks
2025, cites this paper
Forking-Sequences
2025, cites this paper
CFTD: Core Fusion Time Series Dense Encoder for Intelligent Prediction With Edge AI in Social IoT Systems
2025, cites this paper
MADCluster: Model-agnostic Anomaly Detection with Self-supervised Clustering Network
2025, cites this paper
IoT enabled predictive maintenance system in Industry 4.0 using target-based feature pool linked dilated recurrent neural network
2025, cites this paper
Adaptive filter-driven optimized attention-based CNN-LSTM for load forecasting in microgrids
2025, cites this paper
STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking
2025, cites this paper
Time-series forecasting in smart manufacturing systems: An experimental evaluation of the state-of-the-art algorithms
2025, cites this paper
Multi-Task Graph Attention Net for Electricity Consumption Prediction and Anomaly Detection
2025, cites this paper
A review of recent hybridized machine learning methodologies for time series forecasting on water-related variables
2025, cites this paper
Myosotis: structured computation for attention like layer
2025, influential citation
Large Language Models are Zero-Shot Next Location Predictors
2024, cites this paper
Listen and Move: Improving GANs Coherency in Agnostic Sound-to-Video Generation
2024, cites this paper
Deep temporal semi-supervised one-class classification for GNSS radio frequency interference detection
2024, cites this paper
Residual Echo State Networks: Residual recurrent neural networks with stable dynamics and fast learning
2024, cites this paper
Dilated-RNNs: A Deep Approach for Continuous Volcano-Seismic Events Recognition
2024, influential citation
Graph Expansion in Pruned Recurrent Neural Network Layers Preserve Performance
2024, influential citation
Leveraging Non-Decimated Wavelet Packet Features and Transformer Models for Time Series Forecasting
2024, cites this paper
Message passing variational autoregressive network for solving intractable Ising models
2024, cites this paper
A hybrid connectionist/LCS for hidden-state problems
2024, cites this paper
Modeling temporal dual variations for return air temperature prediction of mK-level temperature-controlled clean chamber
2024, cites this paper
Efficient State Space Model via Fast Tensor Convolution and Block Diagonalization
2024, influential citation
Electroencephalogram-based emotion recognition using factorization temporal separable convolution network
2024, cites this paper
Deep Learning Models for Time Series Forecasting: A Review
2024, cites this paper
Automatic Synthesis of Recurrent Neurons for Imitation Learning From CNC Machine Operators
2024, cites this paper
Concrete Dense Network for Long-Sequence Time Series Clustering
2024, cites this paper
Parallelized Spatiotemporal Binding
2024, cites this paper
Prediction of adverse events risk in patients with comorbid post-traumatic stress disorder and alcohol use disorder using electronic medical records by deep learning models
2024, cites this paper
Positional multi-length and mutual-attention network for epileptic seizure classification
2024, cites this paper
High-Speed Nonlinear Circuit Macromodeling Using Hybrid-Module Clockwork Recurrent Neural Network
2024, cites this paper
Rethinking Fourier Transform from A Basis Functions Perspective for Long-term Time Series Forecasting
2024, cites this paper
HyperZ·Z·W Operator Connects Slow-Fast Networks for Full Context Interaction
2024, cites this paper
Graph-based Forecasting with Missing Data through Spatiotemporal Downsampling
2024, cites this paper
Deep learning foundation and pattern models: Challenges in hydrological time series
2024, cites this paper
Sequential Learning Network With Residual Blocks: Incorporating Temporal Convolutional Information Into Recurrent Neural Networks
2024, influential citation
Vista: Machine Learning based Database Performance Troubleshooting Framework in Amazon RDS
2024, cites this paper
PGN: The RNN's New Successor is Effective for Long-Range Time Series Forecasting
2024, cites this paper
Parametric seasonal-trend autoregressive neural network for long-term crop price forecasting
2024, cites this paper
Uncovering Nested Data Parallelism and Data Reuse in DNN Computation with FractalTensor
2024, cites this paper
Parallelized Spatiotemporal Slot Binding for Videos
2024, cites this paper
Rolling bearing fault diagnosis method based on multi-scale pooling residual convolutional neural network under noisy environment
2024, cites this paper
Artificial intelligence in music: recent trends and challenges
2024, influential citation
Time Series Foundation Models and Deep Learning Architectures for Earthquake Temporal and Spatial Nowcasting
2024, cites this paper
Positional Encoding Helps Recurrent Neural Networks Handle a Large Vocabulary
2024, influential citation
Load forecasting method based on CNN and extended LSTM
2024, cites this paper
Automatic Design of LSTM Networks with Skip Connections through Evolutionary and Differentiable Architecture Search
2024, cites this paper
Quantized Approximately Orthogonal Recurrent Neural Networks
2024, cites this paper
An electricity load forecasting model based on multilayer dilated LSTM network and attention mechanism
2023, cites this paper
DeepBiomarker2: Prediction of alcohol and substance use disorder risk in post-traumatic stress disorder patients using electronic medical records and multiple social determinants of health
2023, cites this paper
Local discriminative graph convolutional networks for text classification
2023, cites this paper
Scale-teaching: Robust Multi-scale Training for Time Series Classification with Noisy Labels
2023, cites this paper
Adaptive Modularized Recurrent Neural Networks for Electric Load Forecasting
2023, cites this paper
Policy gradient empowered LSTM with dynamic skips for irregular time series data
2023, cites this paper
Long sequence time-series forecasting with deep learning: A survey
2023, cites this paper
Encoding Recurrence into Transformers
2023, cites this paper
Environmental Knowledge-Driven Over-the-Horizon Propagation Loss Prediction Based on Short- and Long-Term Parallel Double-Flow TrellisNets
2023, cites this paper
Glucose Transformer: Forecasting Glucose Level and Events of Hyperglycemia and Hypoglycemia
2023, cites this paper
Unsupervised Deep Learning for IoT Time Series
2023, cites this paper
Generative Time Series Forecasting with Diffusion, Denoise, and Disentanglement
2023, cites this paper