A Neural Language Model for Dynamically Representing the Meanings of Unknown Words and Entities in a Discourse

Sosuke Kobayashi, Naoaki Okazaki, Kentaro Inui

Published 2017 in International Joint Conference on Natural Language Processing

ABSTRACT

This study addresses the problem of identifying the meaning of unknown words or entities in a discourse with respect to the word embedding approaches used in neural language models. We propose a method for on-the-fly construction and exploitation of word embeddings in both the input and output layers of a neural model by tracking contexts. This extends the dynamic entity representation used in Kobayashi et al. (2016) and incorporates a copy mechanism proposed independently by Gu et al. (2016) and Gulcehre et al. (2016). In addition, we construct a new task and dataset called Anonymized Language Modeling for evaluating the ability to capture word meanings while reading. Experiments conducted using our novel dataset show that the proposed variant of the RNN language model outperforms the baseline model. Furthermore, the experiments demonstrate that dynamic updates of the output layer help a model predict reappearing entities, whereas those of the input layer are effective for predicting words that follow reappearing entities.
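
The idea of building embeddings on the fly and using them in the output layer can be illustrated with a small sketch. This is not the authors' implementation: the class `DynamicEmbeddings`, the function `next_token_distribution`, the anonymized token `[1]`, the toy vectors, and the element-wise max merge (loosely suggested by the max-pooling variant named in the reference list) are all assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class DynamicEmbeddings:
    """Hypothetical store of per-discourse embeddings for anonymized entities."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = {}  # entity token -> current embedding

    def update(self, entity, context):
        # Assumption: merge the old embedding with the new context vector
        # by element-wise max. The paper's actual merge function may differ.
        old = self.vectors.get(entity)
        if old is None:
            self.vectors[entity] = list(context)
        else:
            self.vectors[entity] = [max(a, b) for a, b in zip(old, context)]

    def lookup(self, entity):
        # Entities not yet seen in this discourse get a zero vector.
        return self.vectors.get(entity, [0.0] * self.dim)


def next_token_distribution(hidden, static_out, dynamic):
    """Softmax over static vocabulary words plus entities seen so far."""
    labels = list(static_out) + list(dynamic.vectors)
    scores = [dot(hidden, static_out[w]) for w in static_out]
    scores += [dot(hidden, dynamic.vectors[e]) for e in dynamic.vectors]
    return dict(zip(labels, softmax(scores)))


# Toy usage: the entity "[1]" is mentioned twice, each time in a new context.
static_out = {"said": [0.2, 0.1], "the": [0.1, 0.3]}
dyn = DynamicEmbeddings(dim=2)
dyn.update("[1]", [1.0, 0.0])
dyn.update("[1]", [0.5, 0.8])
probs = next_token_distribution([1.0, 1.0], static_out, dyn)
```

In this toy setting the entity's output embedding is rebuilt from its contexts rather than read from a fixed table, so a reappearing entity can receive probability mass even though it has no trained vocabulary entry. The same `lookup` vector could be fed back as the input embedding when the entity is read again.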

PUBLICATION RECORD

Publication year: 2017
Venue: International Joint Conference on Natural Language Processing
Publication date: 2017-09-06
Fields of study: Linguistics, Computer Science
Identifiers: arXiv 1709.01679
Source metadata: Semantic Scholar

REFERENCES

Dynamic Entity Representations in Neural Language Models (2017, influential reference)
Linguistic Knowledge as Memory for Recurrent Neural Networks (2017, cited by this paper)
Learning to Compute Word Embeddings On the Fly (2017, cited by this paper)
Matching Networks for One Shot Learning (2016, cited by this paper)
Learning Global Features for Coreference Resolution (2016, influential reference)
Dynamic Entity Representation with Max-pooling Improves Machine Reading (2016, influential reference)
Tracking the World State with Recurrent Entity Networks (2016, influential reference)
Reference-Aware Language Models (2016, influential reference)
Language Modeling with Gated Convolutional Networks (2016, cited by this paper)
Deep Reinforcement Learning for Mention-Ranking Coreference Models (2016, influential reference)
Two Discourse Driven Language Models for Semantics (2016, cited by this paper)
Exploring the Limits of Language Modeling (2016, cited by this paper)
Pointer Sentinel Mixture Models (2016, influential reference)
Pointing the Unknown Words (2016, influential reference)
Improving Coreference Resolution by Learning Entity-Level Distributed Representations (2016, cited by this paper)
Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models (2016, cited by this paper)
Recurrent Memory Networks for Language Modeling (2016, cited by this paper)
context2vec: Learning Generic Context Embedding with Bidirectional LSTM (2016, cited by this paper)
Context-dependent word representation for neural machine translation (2016, cited by this paper)
Incorporating Copying Mechanism in Sequence-to-Sequence Learning (2016, influential reference)
Improving Neural Language Models with a Continuous Cache (2016, cited by this paper)
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks (2015, cited by this paper)
Character-Aware Neural Language Models (2015, cited by this paper)
Neural Machine Translation of Rare Words with Subword Units (2015, cited by this paper)
End-To-End Memory Networks (2015, cited by this paper)
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models (2015, cited by this paper)
Teaching Machines to Read and Comprehend (2015, cited by this paper)
Do Multi-Sense Embeddings Improve Natural Language Understanding? (2015, cited by this paper)
Traversing Knowledge Graphs in Vector Space (2015, cited by this paper)
Chainer: a Next-Generation Open Source Framework for Deep Learning (2015, cited by this paper)
Larger-Context Language Modelling with Recurrent Neural Network (2015, cited by this paper)
Addressing the Rare Word Problem in Neural Machine Translation (2014, cited by this paper)
Adam: A Method for Stochastic Optimization (2014, cited by this paper)
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (2014, influential reference)
Neural Machine Translation by Jointly Learning to Align and Translate (2014, cited by this paper)
Distributed Representations of Words and Phrases and their Compositionality (2013, cited by this paper)
One billion word benchmark for measuring progress in statistical language modeling (2013, cited by this paper)
Japanese and Korean voice search (2012, cited by this paper)
CoNLL-2012 Shared Task: Modeling Multilingual Unrestricted Coreference in OntoNotes (2012, influential reference)
Recurrent neural network based language model (2010, influential reference)
Rectified Linear Units Improve Restricted Boltzmann Machines (2010, cited by this paper)
One-shot learning of object categories (2006, cited by this paper)
A Neural Probabilistic Language Model (2003, cited by this paper)
Long Short-Term Memory (1997, cited by this paper)

CITED BY

Veri Artırımı için Yarı-Denetimli Bağlamsal Anlam Belirsizliği Giderme (2021, cites this paper)
On the Embeddings of Variables in Recurrent Neural Networks for Source Code (2021, cites this paper)
Neural Code Completion with Anonymized Variable Names (2020, cites this paper)
Knowledge Efficient Deep Learning for Natural Language Processing (2020, cites this paper)
The Referential Reader: A Recurrent Entity Network for Anaphora Resolution (2019, cites this paper)
Second-order contexts from lexical substitutes for few-shot learning of word representations (2019, cites this paper)
Detecting Nonstandard Word Usages on Social Media (2019, cites this paper)
Improving Pre-Trained Multilingual Model with Vocabulary Expansion (2019, cites this paper)
Representing Movie Characters in Dialogues (2019, cites this paper)
Strategies for Structuring Story Generation (2019, cites this paper)
Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations (2018, cites this paper)
Combining Mutual Information and Entropy for Unknown Word Extraction from Multilingual Code-Switching Sentences (2018, cites this paper)
Dynamic Integration of Background Knowledge in Neural NLU Systems (2017, cites this paper)
Neural Networks for Source Code Processing (year unknown, cites this paper)