Quality Estimation and Translation Metrics via Pre-trained Word and Sentence Embeddings
E. Yankovskaya, Andre Tättar, Mark Fishel
Published 2019 in Conference on Machine Translation
ABSTRACT
We propose using pre-trained embeddings as features of a regression model for sentence-level quality estimation of machine translation. We combine freely available BERT and LASER multilingual embeddings to train a neural regression model. In a second method, the input features include not only the pre-trained embeddings but also the log probability from any machine translation (MT) system. Both methods are applied to several language pairs and are evaluated both as a classical quality estimation system (predicting the HTER score) and as an MT metric (predicting human judgements of translation quality).
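The two feature setups described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper `build_features`, the class `TinyRegressor`, the hidden-layer size, the learning rate, and the toy vectors are all hypothetical, and real BERT and LASER embeddings would have hundreds of dimensions and come from the respective pre-trained models.

```python
import math
import random


def build_features(bert_vec, laser_vec, mt_log_prob=None):
    """Concatenate two embeddings; optionally append an MT log probability.

    With mt_log_prob=None this mirrors the first setup (embeddings only);
    with a value it mirrors the second setup (embeddings + log probability).
    """
    features = list(bert_vec) + list(laser_vec)
    if mt_log_prob is not None:
        features.append(float(mt_log_prob))
    return features


class TinyRegressor:
    """One tanh hidden layer and a linear output (architecture is an assumption)."""

    def __init__(self, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        hidden = self._hidden(x)
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2

    def train_step(self, x, target, lr=0.05):
        """One gradient step on the squared error 0.5 * (pred - target)^2."""
        hidden = self._hidden(x)
        pred = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        err = pred - target
        for j, h in enumerate(hidden):
            # Gradient w.r.t. the hidden pre-activation, using the old w2[j].
            grad_pre = err * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= lr * err * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_pre * xi
            self.b1[j] -= lr * grad_pre
        self.b2 -= lr * err
        return 0.5 * err * err


if __name__ == "__main__":
    # Toy 2-dimensional stand-ins for the real embeddings (made-up values).
    feats = build_features([0.1, -0.3], [0.5, 0.2], mt_log_prob=-1.2)
    model = TinyRegressor(n_inputs=len(feats))
    for _ in range(200):
        model.train_step(feats, target=0.3)  # target plays the role of an HTER score
    print(model.predict(feats))
```

The same regressor can be trained against HTER scores or against human judgement scores; only the target values change, which is what lets one feature setup serve both as a quality estimation system and as an MT metric.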
PUBLICATION RECORD
- Publication year
2019
- Venue
Conference on Machine Translation
- Publication date
2019-08-01
- Fields of study
Linguistics, Computer Science
- Source metadata
Semantic Scholar