Monolingual Marginal Matching for Translation Model Adaptation

Ann Irvine, Chris Quirk, Hal Daumé

Published 2013 in Conference on Empirical Methods in Natural Language Processing

ABSTRACT

When using a machine translation (MT) model trained on OLD-domain parallel data to translate NEW-domain text, one major challenge is the large number of out-of-vocabulary (OOV) and new-translation-sense words. We present a method to identify new translations of both known and unknown source language words that uses NEW-domain comparable document pairs. Starting with a joint distribution of source-target word pairs derived from the OLD-domain parallel corpus, our method recovers a new joint distribution that matches the marginal distributions of the NEW-domain comparable document pairs, while minimizing the divergence from the OLD-domain distribution. Adding learned translations to our French-English MT model results in gains of about 2 BLEU points over strong baselines.
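
The marginal-matching idea in the abstract can be illustrated with a small sketch. The code below uses iterative proportional fitting (IPF), a standard technique that rescales a joint table until its row and column sums match target marginals while staying close, in KL divergence, to the starting table. This is a swapped-in illustration and not necessarily the paper's optimization procedure; the function name `match_marginals`, the `smooth` parameter, and the toy numbers are all assumptions introduced here.

```python
def match_marginals(old_joint, src_marginal, tgt_marginal, smooth=1e-6, iters=200):
    """Rescale an OLD-domain joint table to match NEW-domain marginals via IPF.

    old_joint[i][j] is the OLD-domain probability of source word i paired
    with target word j. `smooth` is a hypothetical knob (not from the paper)
    that lets pairs unseen in OLD, such as OOV source words, receive mass.
    """
    joint = [[v + smooth for v in row] for row in old_joint]
    n_src, n_tgt = len(joint), len(joint[0])
    for _ in range(iters):
        # Rescale each row so its sum equals the NEW source marginal.
        for i in range(n_src):
            row_sum = sum(joint[i])
            if row_sum > 0:
                scale = src_marginal[i] / row_sum
                joint[i] = [v * scale for v in joint[i]]
        # Rescale each column so its sum equals the NEW target marginal.
        for j in range(n_tgt):
            col_sum = sum(joint[i][j] for i in range(n_src))
            if col_sum > 0:
                scale = tgt_marginal[j] / col_sum
                for i in range(n_src):
                    joint[i][j] *= scale
    return joint


# Toy data (invented for illustration): three source words, two target words.
# The third source word never appears in OLD, so its row is all zeros.
old_joint = [[0.4, 0.1],
             [0.1, 0.4],
             [0.0, 0.0]]
new_src = [0.3, 0.3, 0.4]   # NEW-domain source word marginal
new_tgt = [0.5, 0.5]        # NEW-domain target word marginal

new_joint = match_marginals(old_joint, new_src, new_tgt)
for row in new_joint:
    print([round(v, 3) for v in row])
```

In this toy setup the first two source words keep their OLD-domain translation preferences, while the previously unseen third word picks up its NEW-domain mass, split roughly evenly across the two target words because OLD offers no evidence either way.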

PUBLICATION RECORD

  • Publication year

    2013

  • Venue

    Conference on Empirical Methods in Natural Language Processing

  • Publication date

    2013-10-01

  • Fields of study

    Computer Science
