Improving a general-purpose Statistical Translation Engine by Terminological lexicons
Published 2002 in International Conference on Computational Linguistics
ABSTRACT
The past decade has witnessed exciting work in the field of Statistical Machine Translation (SMT). However, accurately evaluating its potential in real-life contexts remains an open question. In this study, we investigate the behavior of an SMT engine faced with a corpus very different from the one it was trained on. We show that terminological databases are obvious resources that should be used to boost the performance of a statistical engine. We propose and evaluate a way of integrating terminology into an SMT engine which yields a significant reduction in word error rate.
PUBLICATION RECORD
- Publication year
2002
- Venue
International Conference on Computational Linguistics
- Publication date
2002-08-31
- Fields of study
Linguistics, Computer Science
- Source metadata
Semantic Scholar