Challenge Training to Simulate Inference in Machine Translation
Wenjie Lu, Jie Zhou, Leiying Zhou, Gongshen Liu, Q. Zhang
Published 2020 in IEEE International Joint Conference on Neural Networks
ABSTRACT
Despite its great success, neural machine translation (NMT) suffers from exposure bias and an evaluation discrepancy. Specifically, the inconsistency in generation between the training and inference processes causes error accumulation and distribution disparity. Furthermore, NMT models are generally optimized with a word-level cross-entropy loss but evaluated by sentence-level metrics; this evaluation-level mismatch can misdirect the improvement of translation performance. To address these two drawbacks, we propose to challenge training to gradually simulate inference: during training, the decoder is fed with inferred words rather than ground-truth words according to a dynamic probability. To ensure accuracy and integrity, we apply alignment and tailoring to the inferred words, so that they can leverage inferred information to help improve the training process. For the dynamic simulation, we define a novel loss-sensitive probability that senses the convergence of training and fine-tunes itself in turn. Experimental results on the IWSLT 2016 German-English and WMT 2019 English-Chinese datasets demonstrate that our method significantly improves translation quality. The alignment-and-tailoring approach outperforms previous work, and the proposed loss-sensitive sampling also helps other state-of-the-art scheduled sampling methods achieve further gains.
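The training scheme described in the abstract — feeding the decoder inferred tokens instead of ground-truth tokens with a probability that adapts to the current training loss — can be sketched as follows. This is a minimal illustration under assumed forms, not the paper's implementation: the function names `loss_sensitive_probability` and `mix_inputs`, and the specific decaying shape of the probability, are hypothetical stand-ins for the loss-sensitive sampling the paper defines.

```python
import random

def loss_sensitive_probability(loss, p_max=0.5, scale=1.0):
    """Map the current training loss to a sampling probability.

    As the loss shrinks (training converges), the probability of
    feeding the decoder its own inferred tokens grows toward p_max;
    early in training (large loss) it stays near zero, so the model
    mostly sees ground-truth tokens.
    """
    return p_max / (1.0 + scale * loss)

def mix_inputs(ground_truth, inferred, p):
    """Build the decoder input for one sentence.

    At each step, keep the ground-truth token with probability 1 - p,
    otherwise substitute the token the model inferred at that step.
    """
    return [inf if random.random() < p else gt
            for gt, inf in zip(ground_truth, inferred)]
```

With `p = 0` this reduces to ordinary teacher forcing, and with `p` near `p_max` the decoder input approaches what it would see at inference time, which is the gradual simulation the abstract describes.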
PUBLICATION RECORD
- Publication date: 2020-07-01
- Fields of study: Computer Science
- Source metadata: Semantic Scholar