Current phrase-based statistical machine translation systems process each test sentence in isolation and do not enforce global consistency constraints, even though the test data is often internally consistent with respect to topic or style. We propose a new consistency model for machine translation in the form of a graph-based semi-supervised learning algorithm that exploits similarities between training and test data and also similarities between different test sentences. The algorithm learns a regression function jointly over training and test data and uses the resulting scores to rerank translation hypotheses. Evaluation on two travel expression translation tasks demonstrates improvements of up to 2.6 BLEU points absolute and 2.8% in PER.
Graph-based Learning for Statistical Machine Translation
Andrei Alexandrescu,K. Kirchhoff
Published 2009 in North American Chapter of the Association for Computational Linguistics
ABSTRACT
PUBLICATION RECORD
- Publication year
2009
- Venue
North American Chapter of the Association for Computational Linguistics
- Publication date
2009-05-31
- Fields of study
Linguistics, Computer Science
- Identifiers
- External record
- Source metadata
Semantic Scholar
CITATION MAP
EXTRACTION MAP
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
Showing 1-23 of 23 references · Page 1 of 1
CITED BY
Showing 1-40 of 40 citing papers · Page 1 of 1