This paper studies three techniques that improve the quality of N-best hypotheses through an additional regeneration process. Unlike the multi-system consensus approach, where multiple translation systems are used, our improvement is achieved by expanding the N-best hypotheses from a single system. We explore three different methods to implement the regeneration process: redecoding, n-gram expansion, and confusion network-based regeneration. Experiments on Chinese-to-English NIST and IWSLT tasks show that all three methods obtain consistent improvements. Moreover, the combination of the three strategies achieves further improvements and outperforms the baseline by 0.81 in BLEU score on the IWSLT'06, 0.57 on the NIST'03, and 0.61 on the NIST'05 test sets, respectively.
Regenerating Hypotheses for Statistical Machine Translation
Boxing Chen, Min Zhang, AiTi Aw, Haizhou Li
Published 2008 in International Conference on Computational Linguistics
PUBLICATION RECORD
- Publication year: 2008
- Venue: International Conference on Computational Linguistics
- Publication date: 2008-08-18
- Fields of study: Linguistics, Computer Science
- Source metadata: Semantic Scholar