How to Construct Deep Recurrent Neural Networks

Razvan Pascanu, Çaglar Gülçehre, Kyunghyun Cho, Yoshua Bengio

Published 2013 in International Conference on Learning Representations

ABSTRACT

In this paper, we explore different ways to extend a recurrent neural network (RNN) to a deep RNN. We start by arguing that the concept of depth in an RNN is not as clear as it is in feedforward neural networks. By carefully analyzing the architecture of an RNN, however, we find three points of an RNN that may be made deeper: (1) the input-to-hidden function, (2) the hidden-to-hidden transition, and (3) the hidden-to-output function. Based on this observation, we propose two novel architectures of a deep RNN which are orthogonal to an earlier attempt at stacking multiple recurrent layers to build a deep RNN (Schmidhuber, 1992; El Hihi and Bengio, 1996). We provide an alternative interpretation of these deep RNNs using a novel framework based on neural operators. The proposed deep RNNs are empirically evaluated on the tasks of polyphonic music prediction and language modeling. The experimental results support our claim that the proposed deep RNNs benefit from the depth and outperform the conventional, shallow RNNs.
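As a rough illustration of the three points named in the abstract, the sketch below deepens each of them with a small two-layer network. It is not the authors' implementation and does not reproduce either of the two proposed architectures exactly: the class name DeepRNNSketch, the layer sizes, the tanh nonlinearity at every layer (including the output), and the random initialization are all assumptions chosen only to keep the example short and dependency-free.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, bias) for one dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def dense_tanh(layer, x):
    """Apply one dense layer followed by tanh to the vector x."""
    weights, bias = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def mlp(layers, x):
    """Apply a stack of dense tanh layers in order."""
    for layer in layers:
        x = dense_tanh(layer, x)
    return x


class DeepRNNSketch:
    """Hypothetical RNN whose three functions are each two layers deep."""

    def __init__(self, n_in, n_hidden, n_out, n_mid=8, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # (1) Input-to-hidden function: two layers instead of one.
        self.input_fn = [make_layer(n_in, n_mid, rng),
                         make_layer(n_mid, n_hidden, rng)]
        # (2) Hidden-to-hidden transition: an MLP over [h_prev, features].
        self.transition_fn = [make_layer(2 * n_hidden, n_mid, rng),
                              make_layer(n_mid, n_hidden, rng)]
        # (3) Hidden-to-output function: two layers instead of one.
        self.output_fn = [make_layer(n_hidden, n_mid, rng),
                          make_layer(n_mid, n_out, rng)]

    def step(self, h_prev, x):
        """Advance one time step; return the new hidden state and output."""
        features = mlp(self.input_fn, x)
        h = mlp(self.transition_fn, h_prev + features)  # list concatenation
        y = mlp(self.output_fn, h)
        return h, y

    def run(self, xs):
        """Process a sequence of input vectors from a zero initial state."""
        h = [0.0] * self.n_hidden
        ys = []
        for x in xs:
            h, y = self.step(h, x)
            ys.append(y)
        return h, ys


if __name__ == "__main__":
    model = DeepRNNSketch(n_in=4, n_hidden=5, n_out=3)
    sequence = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.5, 0.5, 1.0]]
    final_state, outputs = model.run(sequence)
    print(len(final_state), len(outputs))
```

A conventional shallow RNN corresponds to replacing each two-layer stack with a single layer. The stacked RNN mentioned in the abstract is a different construction, in which several recurrent layers each keep their own hidden state.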

PUBLICATION RECORD

Publication year: 2013
Venue: International Conference on Learning Representations
Publication date: 2013-12-20
Fields of study: Mathematics, Computer Science
Identifiers: arXiv 1312.6026
Source metadata: Semantic Scholar

REFERENCES

Recurrent Convolutional Neural Networks for Scene Labeling (2014, cited by this paper)
Learned-Norm Pooling for Deep Feedforward and Recurrent Neural Networks (2013, influential reference)
On the number of response regions of deep feed forward networks with piece-wise linear activations (2013, influential reference)
Maxout Networks (2013, cited by this paper)
On the importance of initialization and momentum in deep learning (2013, cited by this paper)
Training and Analysing Deep Recurrent Neural Networks (2013, cited by this paper)
A New Method for Learning Deep Recurrent Neural Networks (2013, cited by this paper)
A Primal-Dual Method for Training Recurrent Neural Networks Constrained by the Echo-State Property (2013, cited by this paper)
On Fast Dropout and its Applicability to Recurrent Networks (2013, influential reference)
Distributed Representations of Words and Phrases and their Compositionality (2013, cited by this paper)
Recurrent Convolutional Neural Networks for Scene Parsing (2013, cited by this paper)
Efficient Estimation of Word Representations in Vector Space (2013, influential reference)
Revisiting Natural Gradient for Deep Networks (2013, cited by this paper)
Generating Sequences With Recurrent Neural Networks (2013, influential reference)
Speech recognition with deep recurrent neural networks (2013, cited by this paper)
Deep Learning Made Easier by Linear Transformations in Perceptrons (2012, cited by this paper)
Better Mixing via Deep Representations (2012, cited by this paper)
Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription (2012, cited by this paper)
Deep Neural Networks for Acoustic Modeling in Speech Recognition (2012, influential reference)
Theano: new features and speed improvements (2012, cited by this paper)
On the difficulty of training recurrent neural networks (2012, influential reference)
Improving neural networks by preventing co-adaptation of feature detectors (2012, cited by this paper)
Statistical Language Models Based on Neural Networks (2012, cited by this paper)
Shallow vs. Deep Sum-Product Networks (2011, cited by this paper)
Extensions of recurrent neural network language model (2011, cited by this paper)
Practical Variational Inference for Neural Networks (2011, cited by this paper)
Deep Sparse Rectifier Neural Networks (2011, cited by this paper)
Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach (2011, cited by this paper)
Subword Language Modeling with Neural Networks (2011, cited by this paper)
Generating Text with Recurrent Neural Networks (2011, cited by this paper)
The Neural Autoregressive Distribution Estimator (2011, cited by this paper)
Learning Recurrent Neural Networks with Hessian-Free Optimization (2011, cited by this paper)
Recurrent neural network based language model (2010, cited by this paper)
Deep learning via Hessian-free optimization (2010, cited by this paper)
Deep Belief Networks Are Compact Universal Approximators (2010, cited by this paper)
A Novel Connectionist System for Unconstrained Handwriting Recognition (2009, cited by this paper)
Measuring Invariances in Deep Networks (2009, cited by this paper)
GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models (2008, cited by this paper)
Discovering multiscale dynamical features with hierarchical Echo State Networks (2008, cited by this paper)
Learning Deep Architectures for AI (2007, cited by this paper)
Supporting Online Material for Reducing the Dimensionality of Data with Neural Networks (2006, cited by this paper)
An Unsupervised Ensemble Learning Method for Nonlinear Dynamic State-Space Models (2002, cited by this paper)
Hierarchical Recurrent Neural Networks for Long-Term Dependencies (1995, cited by this paper)
Learning long-term dependencies with gradient descent is difficult (1994, cited by this paper)
Building a Large Annotated Corpus of English: The Penn Treebank (1993, cited by this paper)
Learning Complex, Extended Sequences Using the Principle of History Compression (1992, cited by this paper)
Multilayer feedforward networks are universal approximators (1989, cited by this paper)
Learning representations by back-propagating errors (1986, cited by this paper)

CITED BY

Stability properties of Minimal Gated Unit neural networks (2026, cites this paper)
Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks (2026, cites this paper)
Artificial intelligence for microbiology and microbiome research (2026, cites this paper)
Sequential Reservoir Computing for Efficient High-Dimensional Spatiotemporal Forecasting (2026, cites this paper)
Adaptive Output Feedback Control Using Lyapunov-Based Deep Recurrent Neural Networks (Lb-DRNNs) (2026, cites this paper)
Nonlinear dynamic modeling of turbojet engines using combined convolutional and long short-term memory networks (2025, cites this paper)
MultiFAR: Multidimensional information fusion with attention-driven representation learning for student performance prediction (2025, cites this paper)
Deep Carbon: A Multiscale Feature-Time Fusion Approach for Field Level Digital Soil Organic Carbon Mapping (2025, cites this paper)
The Arithmetic Optimization Algorithm based Forecasting System for Stock Prices (2025, cites this paper)
Integrating wavelet transformation for end-to-end direct signal classification (2025, cites this paper)
Interpretable large language models for early prediction of antimicrobial multidrug resistance (2025, cites this paper)
Hybrid-driven meta-learning for wind field prediction in unmanned sailboat applications (2025, cites this paper)
Control-Oriented System Identification: Classical, Learning, and Physics-Informed Approaches (2025, influential citation)
Improved railway track faults detection using Mel-frequency cepstral coefficient and constant-Q transform features (2025, cites this paper)
Comparative analysis of RNN, LSTM and CNN algorithms for marine data prediction (2025, cites this paper)
Multi-branch LSTM encoded latent features with CNN-LSTM for Youtube popularity prediction (2025, influential citation)
RSIC-GMamba: A State-Space Model With Genetic Operations for Remote Sensing Image Captioning (2025, cites this paper)
Cyber Attack Prediction: From Traditional Machine Learning to Generative Artificial Intelligence (2025, cites this paper)
RAT-CC: A Recurrent Autoencoder for Time-Series Compression and Classification (2025, cites this paper)
Deep-Learning Based Aspect Sentiment Classification for Low-Resource Language (2025, cites this paper)
An Artificial Trend Index for Private Consumption Using Google Trends (2025, cites this paper)
Prediction of Sonic Well Logs Using Deep Neural Network: Application to Petroleum Reservoir Characterization in Mexico (2025, cites this paper)
A Comprehensive Categorization and Comparative Analysis of Methodologies in Aspect-Based Sentiment Analysis (2025, cites this paper)
An AI-Based Framework for Detection of Toxicity Levels from Oil and Gas Flaring from Video Streams (2025, cites this paper)
Artificial Intelligence in Data Science: Evaluating Forecasting Models for Solar Energy in the Amazon Basin (2025, cites this paper)
Predicting Preference with Sparse Data in Personalized Gamification via Deep Learning (2025, cites this paper)
Adaptive Control via Lyapunov-Based Deep Long Short-Term Memory Networks (2025, cites this paper)
Memetic Arithmetic Optimization Algorithm and Associated Network Forecasting System for Stock Prices (2025, cites this paper)
3D Pedestrian Trajectory Prediction using Deep Learning from Kinect Data (2025, cites this paper)
Torque Prediction of Internal Combustion Engines Based On Time Series (2025, cites this paper)
Application of Machine Learning in LC-MS-Based Non-Targeted Analysis (2025, cites this paper)
The Role of Deep Learning in Modern Malware Detection: A Comparative Study of Modern Approaches (2025, cites this paper)
Evaluating the impact of music tempo on drivers and their performance using an artificial intelligence model: a multi-source data approach (2025, cites this paper)
An Empirical Study on Bidirectional Recurrent Neural Networks for Human Motion Recognition (2025, cites this paper)
Exploring the Role of Artificial Intelligence and Machine Learning in Process Optimization for Chemical Industry (2025, cites this paper)
When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective (2025, cites this paper)
Memory Capacity of Nonlinear Recurrent Networks: Is it Informative? (2025, cites this paper)
COMPOL: A Unified Neural Operator Framework for Scalable Multi-Physics Simulations (2025, cites this paper)
A Review of Deep Learning-based Power Load Forecasting Methods (2025, cites this paper)
Intelligent Traffic Flow Prediction Using Deep Learning Techniques: A Comparative Study (2025, cites this paper)
Landslide Susceptibility Assessment Using Recurrent Neural Network (RNN)—A Case of Chabahar and Konarak in Iran (2025, cites this paper)
Exploring Various Sequential Learning Methods for Deformation History Modeling (2025, cites this paper)
Reconstructing noisy gene regulation dynamics using extrinsic-noise-driven neural stochastic differential equations (2025, cites this paper)
Advanced Detection Methods and Machine Learning Analysis of Temporal and Spatial Patterns of Equatorial Plasma Bubble Depth (2025, cites this paper)
Artificial Intelligence-Guided Supervised Learning Models for Photocatalysis in Wastewater Treatment (2025, cites this paper)
Forecasting Sesame Price in Ethiopia Using Recurrent Neural Network Based Deep Learning Algorithms (2025, cites this paper)
Smart Road Traffic Monitoring: Unveiling the Synergy of IoT and AI for Enhanced Urban Mobility (2025, cites this paper)
Exploration of the Application of Data-Driven and Generation Models in the Design of Thermoplastic Vulcanizate Rubbers (2025, cites this paper)
Advancing Skin Cancer Detection with CNNs, and Max Pooling Layers (2025, cites this paper)
PDRNN: Modular Data-driven Pedestrian Dead Reckoning on Loosely Coupled Radio- and Inertial-Signalstreams (2025, cites this paper)
A Framework for User Traffic Prediction and Resource Allocation in 5G Networks (2025, cites this paper)
Detecting Harmful Vibrations in the Drilling Industry Through Time Series Classification (2024, cites this paper)
Towards Maps of Disease Progression: Biomedical Large Language Model Latent Spaces For Representing Disease Phenotypes And Pseudotime (2024, cites this paper)
Time Series Generative Learning with Application to Brain Imaging Analysis (2024, cites this paper)
Hack Me If You Can: Aggregating AutoEncoders for Countering Persistent Access Threats Within Highly Imbalanced Data (2024, cites this paper)
Enhancing Food Authentication through E-Nose and E-Tongue Technologies: Current Trends and Future Directions (2024, cites this paper)
Investigation of Machine Learning-based Software Definition Network for Intrusion Detection in IoT (2024, cites this paper)
Machine Learning in Short-Reach Optical Systems: A Comprehensive Survey (2024, cites this paper)
A Novel Deep Learning Model for Student Performance Prediction Using Engagement Data (2024, cites this paper)
EchoSpike Predictive Plasticity: An Online Local Learning Rule for Spiking Neural Networks (2024, cites this paper)
Multi-objective Evolutionary Neural Architecture Search for Recurrent Neural Networks (2024, influential citation)
Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities (2024, cites this paper)
A review: Applications of machine learning and deep learning in aerospace engineering and aero-engine engineering (2024, cites this paper)
Review of Machine learning: Views, Architectures or Techniques, Challenges and Future guidance and Real-world applications (2024, cites this paper)
Intelligent deep learning techniques for energy consumption forecasting in smart buildings: a review (2024, cites this paper)
A BERT-based sequential deep neural architecture to identify contribution statements and extract phrases for triplets from scientific publications (2024, cites this paper)
Do Transformer World Models Give Better Policy Gradients? (2024, cites this paper)
Towards machine-learning driven prognostics and health management of Li-ion batteries. A comprehensive review (2024, cites this paper)
Unlocking the Potential of Spiking Neural Networks: Understanding the What, Why, and Where (2024, cites this paper)
Detecting Thyroid Disease Using Optimized Machine Learning Model Based on Differential Evolution (2024, cites this paper)
Nonlinear Regression With Hierarchical Recurrent Neural Networks Under Missing Data (2024, cites this paper)
Distributed Split Learning for Map-Based Signal Strength Prediction Empowered by Deep Vision Transformer (2024, cites this paper)
DCU: Unidimensional Data Prediction with Graphical Model (2024, cites this paper)
Inter-layer Recurrent Neural Networks Learning Algorithms (2024, cites this paper)
A Deep Learning Approach for State of Health Estimation in Lithium-ion Batteries (2024, cites this paper)
Review of Deep Learning Methods (2024, cites this paper)
Hierarchical Weighted LSTM with One-class Classifier for Preventive Protection of Cultural Heritage in Museums (2024, cites this paper)
Deep Sequence Models for Time Series Data: A Comparative Study and Parameter Fine-Tuning Approach (2024, influential citation)
Personalized Blood Glucose Forecasting From Limited CGM Data Using Incrementally Retrained LSTM (2024, cites this paper)
Cost aware LSTM model for predicting hard disk drive failures based on extremely imbalanced S.M.A.R.T. sensors data (2024, cites this paper)
Leveraging survival analysis in cost-aware deepnet for efficient hard drive failure prediction (2024, cites this paper)
Structural Positional Encoding for Knowledge Integration in Transformer-based Medical Process Monitoring (2024, cites this paper)
Crispr-SGRU: Prediction of CRISPR/Cas9 Off-Target Activities with Mismatches and Indels Using Stacked BiGRU (2024, cites this paper)
Multi-Dimensional Framework for EEG Signal Processing and Denoising Through Tensor-based Architecture (2024, cites this paper)
Deep Learning-Based Code Index Cyclic Shift Keying Spread Spectrum Underwater Acoustic Communication (2024, cites this paper)
Bidirectional Stackable Recurrent Generative Adversarial Imputation Network for Specific Emitter Missing Data Imputation (2024, cites this paper)
Predictive modeling of nonlinear system responses using the Residual Improvement Deep Learning Algorithm (RIDLA) (2024, cites this paper)
Convergence of Gradient Descent for Recurrent Neural Networks: A Nonasymptotic Analysis (2024, cites this paper)
Comparative study of sequence-to-sequence models: From RNNs to transformers (2024, cites this paper)
Cuffless blood pressure estimation from photoplethysmography using deep convolutional neural network and transfer learning (2024, cites this paper)
LanguageFlow: Advancing Diffusion Language Generation with Probabilistic Flows (2024, cites this paper)
Deep Learning for Satellite Image Time-Series Analysis: A review (2024, cites this paper)
A Multiscale Grouping Transformer With CLIP Latents for Remote Sensing Image Captioning (2024, cites this paper)
Machine Learning-Enhanced Forecasting for Efficient Water-Flooded Reservoir Management (2024, cites this paper)
The transformative potential of AI-driven CRISPR-Cas9 genome editing to enhance CAR T-cell therapy (2024, cites this paper)
Ensemble learning approach for distinguishing human and computer-generated Arabic reviews (2024, cites this paper)
Deep Recurrent Stochastic Configuration Networks for Modelling Nonlinear Dynamic Systems (2024, cites this paper)
Role of social capital and financial inclusion in sustainable economic growth (2024, cites this paper)
Brain-Inspired Computing: A Systematic Survey and Future Trends (2024, cites this paper)
A gap filling method for daily evapotranspiration of global flux data sets based on deep learning (2024, cites this paper)