Challenge Training to Simulate Inference in Machine Translation
Wenjie Lu, Jie Zhou, Leiying Zhou, Gongshen Liu, Q. Zhang
Published 2020 in IEEE International Joint Conference on Neural Networks
ABSTRACT
Despite its great success, neural machine translation (NMT) suffers from exposure bias and an evaluation discrepancy. Specifically, the inconsistency in generation between the training and inference processes causes error accumulation and distribution disparity. Furthermore, NMT models are generally optimized with a word-level cross-entropy loss but evaluated by sentence-level metrics; this evaluation-level mismatch can misdirect the improvement of translation performance. To address these two drawbacks, we propose to challenge training to gradually simulate inference: during training, the decoder is fed with inferred words rather than ground-truth words according to a dynamic probability. To ensure accuracy and integrity, we apply alignment and tailoring to the inferred words, so that they can leverage inferred information to help improve the training process. For the dynamic simulation, we define a novel loss-sensitive probability that senses the convergence of training and fine-tunes itself in turn. Experimental results on the IWSLT 2016 German-English and WMT 2019 English-Chinese datasets demonstrate that our method significantly improves translation quality. The alignment-and-tailoring approach outperforms previous work, and the proposed loss-sensitive sampling also helps other state-of-the-art scheduled sampling methods achieve further gains.
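The training scheme described in the abstract — feeding the decoder inferred tokens instead of ground-truth tokens with a probability that adapts to the current training loss — can be sketched as follows. This is a minimal illustration under assumed forms, not the paper's implementation: the function names `loss_sensitive_probability` and `mix_inputs`, and the specific decaying shape of the probability, are hypothetical stand-ins for the loss-sensitive sampling the paper defines.

```python
import random

def loss_sensitive_probability(loss, p_max=0.5, scale=1.0):
    """Map the current training loss to a sampling probability.

    As the loss shrinks (training converges), the probability of
    feeding the decoder its own inferred tokens grows toward p_max;
    early in training (large loss) it stays near zero, so the model
    mostly sees ground-truth tokens.
    """
    return p_max / (1.0 + scale * loss)

def mix_inputs(ground_truth, inferred, p):
    """Build the decoder input for one sentence.

    At each step, keep the ground-truth token with probability 1 - p,
    otherwise substitute the token the model inferred at that step.
    """
    return [inf if random.random() < p else gt
            for gt, inf in zip(ground_truth, inferred)]
```

With `p = 0` this reduces to ordinary teacher forcing, and with `p` near `p_max` the decoder input approaches what it would see at inference time, which is the gradual simulation the abstract describes.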
PUBLICATION RECORD
- Publication date: 2020-07-01
- Fields of study: Computer Science
- Source metadata: Semantic Scholar