Agreement-Based Joint Training for Bidirectional Attention-Based Neural Machine Translation
Yong Cheng, Shiqi Shen, Zhongjun He, W. He, Hua Wu, Maosong Sun, Yang Liu
Published 2015 in International Joint Conference on Artificial Intelligence
ABSTRACT
The attentional mechanism has proven effective in improving end-to-end neural machine translation. However, due to the intricate structural divergence between natural languages, unidirectional attention-based models may capture only partial aspects of attentional regularities. We propose agreement-based joint training for bidirectional attention-based end-to-end neural machine translation. Instead of training source-to-target and target-to-source translation models independently, our approach encourages the two complementary models to agree on word alignment matrices over the same training data. Experiments on Chinese-English and English-French translation tasks show that agreement-based joint training significantly improves both alignment and translation quality over independent training.
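The core idea in the abstract, that the source-to-target and target-to-source models should agree on their word alignment (attention) matrices, can be sketched as a joint objective with an agreement penalty. The sketch below is hypothetical: it assumes a squared-difference disagreement measure and a weighting factor `lam`, neither of which is specified by the abstract itself.

```python
import numpy as np

def disagreement(a_s2t, a_t2s):
    """Squared Frobenius distance between the source-to-target alignment
    matrix (I x J) and the transposed target-to-source matrix (J x I).
    Zero when the two directions agree perfectly."""
    return float(np.sum((a_s2t - a_t2s.T) ** 2))

def joint_loss(nll_s2t, nll_t2s, a_s2t, a_t2s, lam=1.0):
    # Joint training objective (sketch): sum of both directions'
    # negative log-likelihoods plus a penalty that encourages the
    # two attention matrices to agree. `lam` is an assumed weight.
    return nll_s2t + nll_t2s + lam * disagreement(a_s2t, a_t2s)

# Toy example: 2 source words, 3 target words.
a1 = np.array([[0.9, 0.1, 0.0],
               [0.1, 0.2, 0.7]])  # source-to-target attention (I x J)
a2 = a1.T.copy()                  # target-to-source attention (J x I)
print(joint_loss(10.0, 12.0, a1, a2))  # perfect agreement -> 22.0
```

Under independent training, each direction would minimize only its own negative log-likelihood; the agreement term is what couples the two models during joint training.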
PUBLICATION RECORD
- Publication year
2015
- Venue
International Joint Conference on Artificial Intelligence
- Publication date
2015-12-15
- Fields of study
Computer Science
- Source metadata
Semantic Scholar
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
20 references
CITED BY
74 citing papers