Context Gates for Neural Machine Translation

Zhaopeng Tu, Yang Liu, Zhengdong Lu, Xiaohua Liu, Hang Li

Published 2016 in Transactions of the Association for Computational Linguistics

ABSTRACT

In neural machine translation (NMT), generation of a target word depends on both source and target contexts. We find that source contexts have a direct impact on the adequacy of a translation while target contexts affect the fluency. Intuitively, generation of a content word should rely more on the source context and generation of a functional word should rely more on the target context. Due to the lack of effective control over the influence from source and target contexts, conventional NMT tends to yield fluent but inadequate translations. To address this problem, we propose context gates which dynamically control the ratios at which source and target contexts contribute to the generation of target words. In this way, we can enhance both the adequacy and fluency of NMT with more careful control of the information flow from contexts. Experiments show that our approach significantly improves upon a standard attention-based NMT system by 2.3 BLEU points.
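
The gating idea in the abstract can be illustrated with a minimal sketch. It is not the authors' exact formulation: it assumes the source and target contexts are already vectors of the same size, that the gate is a sigmoid over a linear function of both contexts, and that the gate mixes the two contexts element-wise. The names `context_gate`, `mix_contexts`, `w_source`, `w_target`, and `bias` are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def context_gate(source_ctx, target_ctx, w_source, w_target, bias):
    """Compute a per-dimension gate from both contexts (assumed form).

    Each gate value is the share given to the source context; the
    remainder goes to the target context.
    """
    from_source = matvec(w_source, source_ctx)
    from_target = matvec(w_target, target_ctx)
    return [sigmoid(a + b + c) for a, b, c in zip(from_source, from_target, bias)]


def mix_contexts(source_ctx, target_ctx, gate):
    """Blend the contexts element-wise: z * source + (1 - z) * target."""
    return [z * s + (1.0 - z) * t for z, s, t in zip(gate, source_ctx, target_ctx)]


if __name__ == "__main__":
    source = [1.0, 2.0]   # toy source context (hypothetical values)
    target = [3.0, -2.0]  # toy target context (hypothetical values)
    zeros = [[0.0, 0.0], [0.0, 0.0]]

    # With zero weights and zero bias the gate is 0.5 everywhere,
    # so both contexts contribute equally.
    gate = context_gate(source, target, zeros, zeros, [0.0, 0.0])
    print(gate, mix_contexts(source, target, gate))
```

In this sketch, a gate value near 1 lets the source context dominate (as one would want for a content word), and a value near 0 lets the target context dominate (as for a functional word). In a trained system the weights would be learned jointly with the rest of the model rather than set by hand.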

PUBLICATION RECORD

Publication year: 2016
Venue: Transactions of the Association for Computational Linguistics
Publication date: 2016-08-22
Fields of study: Computer Science
Identifiers: DOI 10.1162/tacl_a_00048; arXiv 1608.06043
Source metadata: Semantic Scholar

REFERENCES

Modeling Coverage for Neural Machine Translation (2016, influential reference)
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention (2015, influential reference)
Larger-Context Language Modelling with Recurrent Neural Network (2015, influential reference)
Effective Approaches to Attention-based Neural Machine Translation (2015, cited by this paper)
Document Context Language Models (2015, cited by this paper)
Neural Machine Translation by Jointly Learning to Align and Translate (2014, influential reference)
On Using Very Large Target Vocabulary for Neural Machine Translation (2014, cited by this paper)
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling (2014, cited by this paper)
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation (2014, influential reference)
Sequence to Sequence Learning with Neural Networks (2014, cited by this paper)
Recurrent Continuous Translation Models (2013, cited by this paper)
Context dependent recurrent neural network language model (2012, cited by this paper)
Fluency, Adequacy, or HTER? Exploring Different Human Judgments with a Tunable MT Metric (2009, cited by this paper)
Moses: Open Source Toolkit for Statistical Machine Translation (2007, cited by this paper)
Clause Restructuring for Statistical Machine Translation (2005, cited by this paper)
A Systematic Comparison of Various Statistical Alignment Models (2003, cited by this paper)
Bleu: a Method for Automatic Evaluation of Machine Translation (2002, cited by this paper)
Extensions to HMM-based Statistical Word Alignment Models (2002, influential reference)
Recurrent nets that time and count (2000, cited by this paper)
Long Short-Term Memory (1997, influential reference)
The Mathematics of Statistical Machine Translation: Parameter Estimation (1993, cited by this paper)

CITED BY

Breaking Bias: A Context-Aware Multimodal Framework for Detecting and Neutralizing Ideological Bias in News (2026, cites this paper)
You Are What You Train: Effects of Data Composition on Training Context-aware Machine Translation Models (2025, cites this paper)
Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models (2024, cites this paper)
Study of Translation Strategies for University Students’ English Language Learning in China (2024, cites this paper)
Sequence Shortening for Context-Aware Machine Translation (2024, cites this paper)
Context-Aware Attention Layers coupled with Optimal Transport Domain Adaptation methods for recognizing dementia from spontaneous speech (2023, cites this paper)
Multi-modal adaptive gated mechanism for visual question answering (2023, cites this paper)
Document-Level Neural Machine Translation With Recurrent Context States (2023, cites this paper)
Bridging the Gap between Position-Based and Content-Based Self-Attention for Neural Machine Translation (2023, cites this paper)
Density of States Prediction of Crystalline Materials via Prompt-guided Multi-Modal Transformer (2023, cites this paper)
Context-aware attention layers coupled with optimal transport domain adaptation and multimodal fusion methods for recognizing dementia from spontaneous speech (2023, cites this paper)
Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation (2022, cites this paper)
Context-Fused Guidance for Image Captioning Using Sequence-Level Training (2022, cites this paper)
Analysis on Norms of Word Embedding and Hidden Vectors in Neural Conversational Model Based on Encoder-Decoder RNN (2022, influential citation)
Thinking Hallucination for Video Captioning (2022, influential citation)
MULTILINGUAL DOCUMENT EMBEDDING WITH SEQUENTIAL NEURAL NETWORK MODELS (2022, cites this paper)
CAAN: Context-Aware attention network for visual question answering (2022, cites this paper)
Bilingual attention based neural machine translation (2022, cites this paper)
Exploring Hypotactic Structure for Chinese-English Machine Translation with a Structure-Aware Encoder-Decoder Neural Model (2022, cites this paper)
Visual question answering with gated relation-aware auxiliary (2022, cites this paper)
Divide and Rule: Training Context-Aware Multi-Encoder Translation Models with Little Resources (2021, cites this paper)
A Novel GCN Architecture for Text Generation from Knowledge Graphs: Full Node Embedded Strategy and Context Gate with Copy and Penalty Mechanism (2021, influential citation)
Divide and Rule: Effective Pre-Training for Context-Aware Multi-Encoder Translation Models (2021, cites this paper)
Improving neural machine translation using gated state network and focal adaptive attention networtk (2021, cites this paper)
Modeling hypotactic structure for Chinese-English neural machine translation of complex sentences (2021, cites this paper)
Language Translation as a Socio-Technical System: Case-Studies of Mixed-Initiative Interactions (2021, cites this paper)
Revisiting Negation in Neural Machine Translation (2021, cites this paper)
Context-aware Self-Attention Networks for Natural Language Processing (2021, cites this paper)
KLSI Methods for Human Simultaneous Interpretation and Towards Building a Simultaneous Machine Translation System Reflecting the KLSI Methods (2021, cites this paper)
Entangled Bidirectional Encoder to Autoregressive Decoder for Sequential Recommendation (2021, cites this paper)
G-Transformer for Document-Level Machine Translation (2021, cites this paper)
Multi-Hop Transformer for Document-Level Machine Translation (2021, cites this paper)
Measuring and Improving Faithfulness of Attention in Neural Machine Translation (2021, cites this paper)
A sequence to sequence model for dialogue generation with gated mixture of topics (2021, cites this paper)
Modeling Across-Context Attention For Long-Tail Query Classification in E-commerce (2021, cites this paper)
ACL 2020 Workshop on Automatic Simultaneous Translation Challenges, Recent Advances, and Future Directions (2020, cites this paper)
Neural Machine Translation (2020, cites this paper)
An efficient model-level fusion approach for continuous affect recognition from audiovisual signals (2020, cites this paper)
Show, Edit and Tell: A Framework for Editing Image Captions (2020, cites this paper)
Accurate Structured-Text Spotting for Arithmetical Exercise Correction (2020, cites this paper)
Improving Context-Aware Neural Machine Translation Using Self-Attentive Sentence Embedding (2020, cites this paper)
A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation (2020, cites this paper)
TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation (2020, cites this paper)
Improving Neural Machine Translation with Linear Interpolation of a Short-Path Unit (2020, cites this paper)
Modeling Discourse Structure for Document-level Neural Machine Translation (2020, cites this paper)
Analyzing the Source and Target Contributions to Predictions in Neural Machine Translation (2020, cites this paper)
Human Quality Evaluation of Machine-Translated Poetry (2020, cites this paper)
A Document-Level Neural Machine Translation Model with Dynamic Caching Guided by Theme-Rheme Information (2020, cites this paper)
Translation of New Named Entities from English to Chinese (2020, cites this paper)
Training with Adversaries to Improve Faithfulness of Attention in Neural Machine Translation (2020, influential citation)
A Review of Discourse-level Machine Translation (2020, cites this paper)
Neural Text Generation with Artificial Negative Examples (2020, cites this paper)
An Implementation of a System for Video Translation Using OCR (2020, cites this paper)
The Translation Problem (2020, cites this paper)
Uses of Machine Translation (2020, cites this paper)
Neural Translation Models (2020, cites this paper)
Beyond Parallel Corpora (2020, cites this paper)
Modeling Recurrence for Transformer (2019, cites this paper)
Learning to Select, Track, and Generate for Data-to-Text (2019, cites this paper)
Non-Autoregressive Machine Translation with Auxiliary Regularization (2019, cites this paper)
Neural Machine Translation With Noisy Lexical Constraints (2019, cites this paper)
Effectively training neural machine translation models with monolingual data (2019, cites this paper)
Data-to-Text Generation with Attention Recurrent Unit (2019, cites this paper)
Document-level Neural Machine Translation with Document Embeddings (2019, cites this paper)
Bidirectional Context-Aware Hierarchical Attention Network for Document Understanding (2019, cites this paper)
Context-Aware Self-Attention Networks (2019, cites this paper)
Document-level Neural Machine Translation with Inter-Sentence Attention (2019, influential citation)
The Encoder-Decoder Framework and Its Applications (2019, cites this paper)
Interrogating the Explanatory Power of Attention in Neural Machine Translation (2019, cites this paper)
Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generation (2019, cites this paper)
Regularized Context Gates on Transformer for Machine Translation (2019, influential citation)
Fast derivation of neural network based document vectors with distance constraint and negative sampling (2018, cites this paper)
Replacement of Unknown Words Using an Attention Model in Japanese to English Neural Machine Translation (2018, cites this paper)
The Integration of Gated Filtering Mechanism and Deep Bi-LSTM-CRF for Chinese Semantic Role Labeling (2018, cites this paper)
The Sockeye Neural Machine Translation Toolkit at AMTA 2018 (2018, cites this paper)
Sequential Context Encoding for Duplicate Removal (2018, cites this paper)
Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity (2018, influential citation)
Fast Derivation of Cross-lingual Document Vectors from Self-attentive Neural Machine Translation Model (2018, cites this paper)
Neural machine translation framework based cross-lingual document vector with distance constraint training (2018, cites this paper)
Incorporating Statistical Machine Translation Word Knowledge Into Neural Machine Translation (2018, cites this paper)
Linguistic Knowledge-Aware Neural Machine Translation (2018, cites this paper)
Document-Level Neural Machine Translation with Hierarchical Attention Networks (2018, cites this paper)
Neural Machine Translation with Decoding History Enhanced Attention (2018, cites this paper)
Adaptive Weighting for Neural Machine Translation (2018, cites this paper)
History attention for source-target alignment in neural machine translation (2018, cites this paper)
Fusing Recency into Neural Machine Translation with an Inter-Sentence Gate Model (2018, cites this paper)
Neural Language Models (2018, cites this paper)
Natural Answer Generation with Heterogeneous Memory (2018, cites this paper)
An Analysis of Source Context Dependency in Neural Machine Translation (2018, influential citation)
Sparse and Constrained Attention for Neural Machine Translation (2018, cites this paper)
Word Rewarding for Adequate Neural Machine Translation (2018, cites this paper)
Improving the Quality of Neural Machine Translation (2018, cites this paper)
Proceedings of the 21st Annual Conference of the European Association for Machine Translation (2018, cites this paper)
Neural Machine Translation with Dynamic Selection Network (2018, cites this paper)
Improving Sequence-to-Sequence Constituency Parsing (2018, cites this paper)
Neural Natural Language Generation with Unstructured Contextual Information (2018, influential citation)
Translating Pro-Drop Languages with Reconstruction Models (2018, influential citation)
Learning to refine source representations for neural machine translation (2018, cites this paper)
Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment (2018, cites this paper)
Apply Chinese Radicals Into Neural Machine Translation: Deeper Than Character Level (2018, cites this paper)