Recurrent Memory Networks for Language Modeling

Ke M. Tran, Arianna Bisazza, Christof Monz

Published 2016 in North American Chapter of the Association for Computational Linguistics

ABSTRACT

Recurrent Neural Networks (RNNs) have obtained excellent results in many natural language processing (NLP) tasks. However, understanding and interpreting the source of this success remains a challenge. In this paper, we propose the Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our understanding of their internal functioning and allows us to discover underlying patterns in data. We demonstrate the power of the RMN on language modeling and sentence completion tasks. On language modeling, the RMN outperforms the Long Short-Term Memory (LSTM) network on three large German, Italian, and English datasets. Additionally, we perform an in-depth analysis of the various linguistic dimensions that the RMN captures. On the Sentence Completion Challenge, for which it is essential to capture sentence coherence, our RMN obtains 69.2% accuracy, surpassing the previous state-of-the-art by a large margin.
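The abstract describes the RMN as an RNN augmented with a memory component whose attention weights over recent words can be inspected, which is what makes the model interpretable. The sketch below is a minimal, hypothetical illustration of that idea (not the paper's exact architecture or parameterization): an attention lookup over the embeddings of the n most recent words, conditioned on the current recurrent hidden state; all dimensions and weight matrices here are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8   # embedding / hidden size (illustrative, not from the paper)
n = 4   # memory size: number of most recent words attended over

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def memory_block(h_t, recent_embeddings, W_m, W_h):
    """Attend over the n most recent word embeddings, conditioned on
    the current hidden state h_t. Returns a context vector (weighted
    sum of projected memories) and the attention weights, which can be
    inspected to see which past words the model relies on."""
    M = recent_embeddings @ W_m        # (n, d) memory representations
    scores = M @ (W_h @ h_t)           # (n,) match against hidden state
    a = softmax(scores)                # attention weights, sum to 1
    return a @ M, a

h_t = rng.standard_normal(d)               # current RNN hidden state
recent = rng.standard_normal((n, d))       # embeddings of the last n words
W_m = rng.standard_normal((d, d)) * 0.1    # memory projection (hypothetical)
W_h = rng.standard_normal((d, d)) * 0.1    # hidden-state projection (hypothetical)

context, attn = memory_block(h_t, recent, W_m, W_h)
```

In the paper's setting, the weights `attn` are what enables the linguistic analysis mentioned in the abstract: high weight on a particular past position indicates which word the model attends to when predicting the next one.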

PUBLICATION RECORD

  • Publication year

    2016

  • Venue

    North American Chapter of the Association for Computational Linguistics

  • Publication date

    2016-01-06

  • Fields of study

    Linguistics, Computer Science


  • Source metadata

    Semantic Scholar
