Neural Machine Translation by Jointly Learning to Align and Translate
Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
Published 2014 in International Conference on Learning Representations
ABSTRACT
Neural machine translation is a recently proposed approach to machine translation. Unlike traditional statistical machine translation, neural machine translation aims at building a single neural network that can be jointly tuned to maximize translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector, from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
PUBLICATION RECORD
- Publication year
2014
- Venue
International Conference on Learning Representations
- Publication date
2014-09-01
- Fields of study
Mathematics, Computer Science
- Source metadata
Semantic Scholar
LINKED PAPERS
- On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
- rnn encoder--decoder (is a) · An RNN Encoder–Decoder is a recurrent neural machine translation model that follows the encoder-decoder architecture of encoding a source sentence and decoding a target translation.
- neural machine translation (related to) · Both neural machine translation concepts describe the same neural-network-based translation approach that maps a source sentence to a target sentence.
CONCEPTS
- encoder-decoder architecture
A neural model structure that encodes a source sentence and then decodes a target translation from the encoded representation.
Aliases: encoder-decoder
- english-to-french translation
The translation task that maps English source sentences to French target sentences.
- fixed-length vector
A single vector used to summarize the entire source sentence in the basic encoder-decoder model.
- neural machine translation
A machine translation approach that uses a single neural network to model translation from source to target.
- phrase-based machine translation system
A phrase-based statistical machine translation baseline used as the comparison system.
Aliases: phrase-based system
- soft alignment
A learned mechanism that assigns soft relevance across source-sentence positions while predicting each target word.
Aliases: soft-search, soft-alignments
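The soft alignment concept above can be illustrated with a minimal sketch. It assumes a toy setting: source positions are represented by small hand-written vectors, and relevance is scored with a plain dot product. The dot-product scorer is a stand-in chosen for brevity and is not taken from this record; the names `soft_align`, `dot_score`, and the example numbers are all hypothetical. The point is only that each target word gets a set of weights over all source positions, rather than one fixed-length summary of the whole sentence.

```python
import math


def softmax(scores):
    """Turn raw relevance scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_score(decoder_state, source_state):
    """Hypothetical scoring function (an assumption): a plain dot product."""
    return sum(a * b for a, b in zip(decoder_state, source_state))


def soft_align(decoder_state, source_states, score_fn=dot_score):
    """Compute soft alignment weights and a context vector for one target word.

    weights[j] is the soft relevance of source position j.
    context is the weighted average of the source states.
    """
    scores = [score_fn(decoder_state, h) for h in source_states]
    weights = softmax(scores)
    dim = len(source_states[0])
    context = [
        sum(w * h[k] for w, h in zip(weights, source_states))
        for k in range(dim)
    ]
    return weights, context


# Toy example: three source positions with 2-dimensional states (made-up numbers).
source_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = soft_align(decoder_state, source_states)
```

In this toy run the first and third source positions score equally and higher than the second, so they receive equal and larger weights. Because the weights are recomputed for every target word, the context vector changes from word to word, which is what distinguishes this from a single fixed-length vector.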