Scaling Phrase-Based Statistical Machine Translation to Larger Corpora and Longer Phrases

Chris Callison-Burch,C. Bannard,Josh Schroeder

Published 2005 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

In this paper we describe a novel data structure for phrase-based statistical machine translation which allows for the retrieval of arbitrarily long phrases while simultaneously using less memory than is required by current decoder implementations. We detail the computational complexity and average retrieval times for looking up phrase translations in our suffix array-based data structure. We show how sampling can be used to reduce the retrieval time by orders of magnitude with no loss in translation quality.

PUBLICATION RECORD

  • Publication year

    2005

  • Venue

    Annual Meeting of the Association for Computational Linguistics

  • Publication date

    2005-06-25

  • Fields of study

    Computer Science

  • Identifiers
  • External record

    Open on Semantic Scholar

  • Source metadata

    Semantic Scholar

CITATION MAP

EXTRACTION MAP

CLAIMS

  • No claims are published for this paper.

CONCEPTS

  • No concepts are published for this paper.

CITED BY

Showing 1-100 of 107 citing papers · Page 1 of 2