ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation

Chin-Yew Lin, Franz Josef Och

Published 2004 in International Conference on Computational Linguistics

ABSTRACT

Comparisons of automatic evaluation metrics for machine translation are usually conducted at the corpus level, using correlation statistics such as Pearson's product-moment correlation coefficient or Spearman's rank-order correlation coefficient between human scores and automatic scores. However, such comparisons rely on human judgments of translation quality, such as adequacy and fluency. Unfortunately, these judgments are often inconsistent and very expensive to acquire. In this paper, we introduce a new evaluation method, ORANGE, for evaluating automatic machine translation evaluation metrics automatically, with no extra human involvement beyond a set of reference translations. We also report results of comparing several existing automatic metrics and three new automatic metrics using ORANGE.
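
As context for the comparison described above, the following is a minimal sketch of the conventional corpus-level procedure that the abstract contrasts with ORANGE: correlating an automatic metric's scores with human judgments via Pearson's and Spearman's coefficients. It is written in Python with SciPy; the per-system scores are hypothetical placeholders, not data from the paper.

    # Hedged sketch of conventional corpus-level metric evaluation.
    # Both score lists below are illustrative placeholders.
    from scipy.stats import pearsonr, spearmanr

    # Hypothetical per-system scores: human adequacy/fluency judgments and
    # an automatic metric's scores for the same set of MT systems.
    human_scores = [3.1, 2.4, 3.8, 2.9, 3.5]
    metric_scores = [0.31, 0.22, 0.41, 0.27, 0.36]

    pearson_r, _ = pearsonr(human_scores, metric_scores)      # linear correlation
    spearman_rho, _ = spearmanr(human_scores, metric_scores)  # rank-order correlation

    print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")

A metric is judged better under this scheme when its scores correlate more strongly with the human judgments; ORANGE removes the need for those human judgments, requiring only the reference translations.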

PUBLICATION RECORD

  • Publication year

    2004

  • Venue

    International Conference on Computational Linguistics

  • Publication date

    2004-08-23

  • Fields of study

    Computer Science

  • Source metadata

    Semantic Scholar

CITED BY

422 citing papers indexed by Semantic Scholar.