Results of the WMT18 Metrics Shared Task: Both characters and embeddings achieve good performance

Qingsong Ma,Ondrej Bojar,Yvette Graham

Published 2018 in Conference on Machine Translation

ABSTRACT

This paper presents the results of the WMT18 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT18 News Translation Task with automatic metrics. We collected scores of 10 metrics and 8 research groups. In addition to that, we computed scores of 8 standard metrics (BLEU, SentBLEU, chrF, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system-level correlation (how well each metric’s scores correlate with WMT18 official manual ranking of systems) and in terms of segment-level correlation (how often a metric agrees with humans in judging the quality of a particular sentence relative to alternate outputs). This year, we employ a single kind of manual evaluation: direct assessment (DA).

PUBLICATION RECORD

  • Publication year

    2018

  • Venue

    Conference on Machine Translation

  • Publication date

    2018-10-01

  • Fields of study

    Linguistics, Computer Science

  • Identifiers
  • External record

    Open on Semantic Scholar

  • Source metadata

    Semantic Scholar

CITATION MAP

EXTRACTION MAP

CLAIMS

  • No claims are published for this paper.

CONCEPTS

  • No concepts are published for this paper.

REFERENCES

Showing 1-33 of 33 references · Page 1 of 1

CITED BY

Showing 1-100 of 105 citing papers · Page 1 of 2