We demonstrate that “hallucinating” phrasal translations can significantly improve the quality of machine translation in low resource conditions. Our hallucinated phrase tables consist of entries composed from multiple unigram translations drawn from the baseline phrase table and from translations that are induced from monolingual corpora. The hallucinated phrase table is very noisy. Its translations are low precision but high recall. We counter this by introducing 30 new feature functions (including a variety of monolinguallyestimated features) and by aggressively pruning the phrase table. Our analysis evaluates the intrinsic quality of our hallucinated phrase pairs as well as their impact in end-to-end Spanish-English and Hindi-English MT.
Hallucinating Phrase Translations for Low Resource MT
Ann Irvine,Chris Callison-Burch
Published 2014 in Conference on Computational Natural Language Learning
ABSTRACT
PUBLICATION RECORD
- Publication year
2014
- Venue
Conference on Computational Natural Language Learning
- Publication date
2014-06-01
- Fields of study
Linguistics, Computer Science
- Identifiers
- External record
- Source metadata
Semantic Scholar
CITATION MAP
EXTRACTION MAP
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
Showing 1-31 of 31 references · Page 1 of 1
CITED BY
Showing 1-25 of 25 citing papers · Page 1 of 1