Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions

Philipp Koehn, Francisco (Paco) Guzmán, Vishrav Chaudhary, J. Pino

Published 2019 in Conference on Machine Translation

ABSTRACT

Following the WMT 2018 Shared Task on Parallel Corpus Filtering, we posed the challenge of assigning sentence-level quality scores to very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting the highest-quality 2% and 10% of the data for training machine translation systems. This year, the task tackled the low-resource conditions of Nepali-English and Sinhala-English. Eleven participants from companies, national research labs, and universities took part in the task.
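As an illustration of the sub-selection step described in the abstract, the following is a minimal sketch, not the organizers' tooling. It assumes each sentence pair already carries a quality score and that the 2% and 10% subsets are measured in sentence pairs; the abstract does not say how the subsets are measured, and the shared task may define them differently. The names `ScoredPair` and `select_top_fraction` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical record layout: (quality score, source sentence, English sentence).
ScoredPair = Tuple[float, str, str]


def select_top_fraction(pairs: List[ScoredPair], fraction: float) -> List[ScoredPair]:
    """Return the highest-scoring `fraction` of sentence pairs, best first.

    Assumption: the fraction is measured in sentence pairs, not in words.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Sort by score, highest first; sorted() is stable, so ties keep input order.
    ranked = sorted(pairs, key=lambda p: p[0], reverse=True)
    # Round to the nearest whole pair and keep at least one pair.
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]


if __name__ == "__main__":
    # Toy corpus with made-up scores, for demonstration only.
    corpus = [(i / 100, f"src{i}", f"en{i}") for i in range(100)]
    for frac in (0.02, 0.10):
        subset = select_top_fraction(corpus, frac)
        print(f"{frac:.0%}: kept {len(subset)} of {len(corpus)} pairs")
```

Under these assumptions, a participant's scoring method only has to produce one number per sentence pair; the ranking and cut-off are independent of how the score was computed.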

PUBLICATION RECORD

  • Fields of study

    Linguistics, Computer Science

