DeepSubQE: Quality estimation for subtitle translations

Published 2020 in arXiv.org

ABSTRACT

Quality estimation (QE) for tasks involving language data is hard because natural language varies in paraphrasing, style, grammar, and more. There can be multiple answers with varying levels of acceptability depending on the application at hand. In this work, we look at estimating the quality of translations for video subtitles. We show how existing QE methods are inadequate and propose DeepSubQE, a system that estimates translation quality given subtitle data for a pair of languages. We rely on various data augmentation strategies for automated labelling and for synthesizing training data. We create a hybrid network that learns semantic and syntactic features of bilingual data and compare it with LSTM-only and CNN-only networks. Our proposed network outperforms them by a significant margin.
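
The abstract mentions data augmentation for automated labelling but does not say which strategies are used. As a minimal illustration only, the sketch below shows one common way such labels could be synthesized from aligned subtitle pairs: keep each aligned pair as a positive example and pair each source line with another line's target as a negative example. The function name `synthesize_qe_examples`, the binary label scheme, and the sample sentences are all assumptions made for this sketch, not the paper's actual procedure.

```python
import random


def synthesize_qe_examples(pairs, seed=0):
    """Build (source, target, label) triples from aligned subtitle pairs.

    Hypothetical scheme (not taken from the paper):
      label 1 = the aligned pair, kept as-is
      label 0 = the source paired with a different line's target
    """
    if len(pairs) < 2:
        raise ValueError("need at least two aligned pairs to build mismatches")

    rng = random.Random(seed)
    examples = []
    for i, (src, tgt) in enumerate(pairs):
        # Positive example: the original aligned pair.
        examples.append((src, tgt, 1))

        # Negative example: pick an index other than i, uniformly at random.
        j = rng.randrange(len(pairs) - 1)
        if j >= i:
            j += 1
        wrong_tgt = pairs[j][1]

        # Skip the negative if the other line happens to have the same
        # target text (e.g. repeated short lines such as "Yes.").
        if wrong_tgt != tgt:
            examples.append((src, wrong_tgt, 0))
    return examples


if __name__ == "__main__":
    sample = [
        ("Hello.", "Bonjour."),
        ("Thank you.", "Merci."),
        ("See you tomorrow.", "À demain."),
    ]
    for example in synthesize_qe_examples(sample):
        print(example)
```

Mismatched pairs are only the easiest kind of negative to generate; a realistic pipeline would likely need subtler corruptions as well, which this sketch does not attempt.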

PUBLICATION RECORD

Publication year
2020
Venue
arXiv.org
Publication date
2020-04-22
Fields of study
Mathematics, Linguistics, Computer Science
Identifiers
arXiv 2004.13828
Source metadata
Semantic Scholar

REFERENCES

Unsupervised Quality Estimation Without Reference Corpus for Subtitle Machine Translation Using Word Embeddings
2019, cited by this paper
Problems with automating translation of movie/TV show subtitles
2019, cited by this paper
OpenKiwi: An Open Source Framework for Quality Estimation
2019, cited by this paper
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
2018, cited by this paper
Multilingual Short Text Classification via Convolutional Neural Network
2018, cited by this paper
A Constrained Deep Neural Network for Ordinal Regression
2018, cited by this paper
Predictor-Estimator using Multilevel Task Learning with Stack Propagation for Neural Quality Estimation
2017, cited by this paper
Sockeye: A Toolkit for Neural Machine Translation
2017, cited by this paper
Word Translation Without Parallel Data
2017, cited by this paper
Enriching Word Vectors with Subword Information
2016, cited by this paper
Learning semantic representations using convolutional neural networks for web search
2014, cited by this paper
Adam: A Method for Stochastic Optimization
2014, cited by this paper
Learning deep structured semantic models for web search using clickthrough data
2013, cited by this paper
Parallel Data, Tools and Interfaces in OPUS
2012, cited by this paper
Estimating the Sentence-Level Quality of Machine Translation Systems
2009, cited by this paper
Visualizing Data using t-SNE
2008, cited by this paper
A Study of Translation Error Rate with Targeted Human Annotation
2005, cited by this paper
2005 Special Issue: Framewise phoneme classification with bidirectional LSTM and other neural network architectures
2005, cited by this paper
METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments
2005, cited by this paper
Confidence Estimation for Machine Translation
2004, cited by this paper
Bleu: a Method for Automatic Evaluation of Machine Translation
2002, cited by this paper
Motivations
2000, cited by this paper
The Mathematics of Statistical Machine Translation: Parameter Estimation
1993, cited by this paper

CITED BY

Detecting over/under-translation errors for determining adequacy in human translations
2021, cites this paper