Long Short-Term Memory-Networks for Machine Reading
Jianpeng Cheng, Li Dong, Mirella Lapata
Published 2016 in Conference on Empirical Methods in Natural Language Processing
ABSTRACT
In this paper we address the question of how to render sequence-level networks better at handling structured input. We propose a machine reading simulator that processes text incrementally from left to right and performs shallow reasoning with memory and attention. The reader extends the Long Short-Term Memory architecture with a memory network in place of a single memory cell. This enables adaptive memory usage during recurrence with neural attention, offering a way to weakly induce relations among tokens. The system is initially designed to process a single sequence, but we also demonstrate how to integrate it with an encoder-decoder architecture. Experiments on language modeling, sentiment analysis, and natural language inference show that our model matches or outperforms the state of the art.
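The abstract's core idea, replacing the LSTM's single memory cell with tapes of all past hidden and cell states that are read with attention at each step, can be made concrete with a short sketch. The following is a minimal illustration, not the authors' released implementation: it assumes an additive attention score over the hidden tape and standard LSTM gates driven by the attention summaries, and the names (LSTMNCell, attn_x, attn_h, attn_v) are hypothetical.

```python
# Minimal sketch of an LSTMN-style cell, under the assumptions stated above.
import torch
import torch.nn as nn


class LSTMNCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # Additive attention scoring over the hidden tape (an assumed form).
        self.attn_x = nn.Linear(input_size, hidden_size, bias=False)
        self.attn_h = nn.Linear(hidden_size, hidden_size, bias=False)
        self.attn_v = nn.Linear(hidden_size, 1, bias=False)
        # Standard LSTM gates, driven by the input and the attention summary.
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)

    def forward(self, x, h_tape, c_tape):
        # x: (batch, input_size); h_tape, c_tape: (batch, t, hidden_size)
        # Score every past position against the current input token.
        scores = self.attn_v(torch.tanh(
            self.attn_h(h_tape) + self.attn_x(x).unsqueeze(1)))   # (batch, t, 1)
        weights = torch.softmax(scores, dim=1)
        # Adaptive summaries of the hidden and cell tapes: this is the
        # "memory network in place of a single memory cell" step.
        h_summary = (weights * h_tape).sum(dim=1)
        c_summary = (weights * c_tape).sum(dim=1)
        # Usual LSTM gate computations, with the summaries standing in for
        # the previous hidden and cell states.
        i, f, o, g = self.gates(
            torch.cat([x, h_summary], dim=-1)).chunk(4, dim=-1)
        c_t = torch.sigmoid(f) * c_summary + torch.sigmoid(i) * torch.tanh(g)
        h_t = torch.sigmoid(o) * torch.tanh(c_t)
        # Append the new states to the tapes for the next time step.
        h_tape = torch.cat([h_tape, h_t.unsqueeze(1)], dim=1)
        c_tape = torch.cat([c_tape, c_t.unsqueeze(1)], dim=1)
        return h_t, h_tape, c_tape


# Usage: seed each tape with one zero state, so the first step reduces to a
# standard LSTM step, then unroll left to right as the abstract describes.
cell = LSTMNCell(input_size=32, hidden_size=64)
x = torch.randn(8, 5, 32)                 # (batch, seq_len, input_size)
h_tape = torch.zeros(8, 1, 64)
c_tape = torch.zeros(8, 1, 64)
for t in range(x.size(1)):
    h_t, h_tape, c_tape = cell(x[:, t], h_tape, c_tape)
```

Because the attention weights over past tokens are computed at every step, they serve as the weakly induced token-to-token relations mentioned in the abstract.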
PUBLICATION RECORD
- Publication year: 2016
- Venue: Conference on Empirical Methods in Natural Language Processing
- Publication date: 2016-01-25
- Fields of study: Computer Science
- Source metadata: Semantic Scholar