Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering

Philipp Koehn, Huda Khayrallah, Kenneth Heafield, M. Forcada

Published 2018 in Conference on Machine Translation

ABSTRACT

We posed the shared task of assigning sentence-level quality scores for a very noisy corpus of sentence pairs crawled from the web, with the goal of sub-selecting 1% and 10% of high-quality data to be used to train machine translation systems. Seventeen participants from companies, national research labs, and universities participated in this task.
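The sub-selection step described above can be sketched as follows. This is a minimal illustration, not the task's official tooling: the function name `subselect`, the whitespace tokenization, and the choice to measure the 1% and 10% budgets in tokens on the first side of each pair are all assumptions of this sketch, since the abstract does not say how the percentages are measured.

```python
from typing import List, Sequence, Tuple

SentencePair = Tuple[str, str]


def subselect(
    pairs: Sequence[SentencePair],
    scores: Sequence[float],
    fraction: float,
) -> List[SentencePair]:
    """Keep the highest-scoring pairs until `fraction` of the tokens is reached.

    Hypothetical helper: tokens are counted by whitespace splitting on the
    first side of each pair, which is an assumption of this sketch.
    """
    if len(pairs) != len(scores):
        raise ValueError("pairs and scores must have the same length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    # Token budget derived from the size of the full noisy corpus.
    total_tokens = sum(len(src.split()) for src, _ in pairs)
    budget = fraction * total_tokens

    # Visit pairs from best score to worst.
    order = sorted(range(len(pairs)), key=lambda i: scores[i], reverse=True)

    selected: List[SentencePair] = []
    used = 0
    for i in order:
        if used >= budget:
            break
        selected.append(pairs[i])
        used += len(pairs[i][0].split())
    return selected


if __name__ == "__main__":
    corpus = [("a b", "x"), ("c d e f", "y"), ("g h", "z"), ("i j", "w")]
    quality = [0.9, 0.1, 0.8, 0.5]
    print(subselect(corpus, quality, 0.2))
```

Note that the last pair added may push the selection slightly over the budget; a stricter variant could stop before the pair that would exceed it.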

PUBLICATION RECORD

  • Publication year

    2018

  • Venue

    Conference on Machine Translation

  • Publication date

    2018-10-31

  • Fields of study

    Linguistics, Computer Science

  • Source metadata

    Semantic Scholar
