Multi-Task Word Alignment Triangulation for Low-Resource Languages
Published 2015 in North American Chapter of the Association for Computational Linguistics

ABSTRACT
We present a multi-task learning approach that jointly trains three word alignment models over disjoint bitexts of three languages: source, target, and pivot. Our approach builds upon model triangulation, following Wang et al., which approximates a source-target model by combining source-pivot and pivot-target models. We develop a MAP-EM algorithm that uses triangulation as a prior, and show how to extend it to a multi-task setting. On a low-resource Czech-English corpus, using French as the pivot, our multi-task learning approach more than doubles the gains in both F and Bleu scores compared to the interpolation approach of Wang et al. Further experiments reveal that the choice of pivot language does not significantly affect performance.
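The triangulation idea the abstract builds on can be illustrated with a small sketch. This is not the paper's implementation; it is a hedged toy version of the general technique of composing a source-pivot and a pivot-target lexical translation table through the shared pivot vocabulary, with renormalization. All word pairs and probabilities below are hypothetical.

```python
def triangulate(src_pivot, pivot_tgt):
    """Compose two translation tables through a shared pivot vocabulary.

    src_pivot: dict mapping source word -> {pivot word: prob}
    pivot_tgt: dict mapping pivot word -> {target word: prob}
    Returns dict mapping source word -> {target word: prob}, renormalized,
    i.e. t(t|s) ~ sum_p t(p|s) * t(t|p).
    """
    src_tgt = {}
    for s, pivots in src_pivot.items():
        scores = {}
        for p, sp_prob in pivots.items():
            # Marginalize over the pivot word p.
            for t, pt_prob in pivot_tgt.get(p, {}).items():
                scores[t] = scores.get(t, 0.0) + sp_prob * pt_prob
        total = sum(scores.values())
        if total > 0:
            src_tgt[s] = {t: v / total for t, v in scores.items()}
    return src_tgt

# Toy Czech -> French -> English example (made-up probabilities):
cs_fr = {"pes": {"chien": 0.9, "chat": 0.1}}
fr_en = {"chien": {"dog": 0.8, "hound": 0.2}, "chat": {"cat": 1.0}}
print(triangulate(cs_fr, fr_en)["pes"])  # dog gets 0.9 * 0.8 = 0.72
```

In the paper, a triangulated table in this spirit serves as a prior inside a MAP-EM alignment procedure rather than being used directly.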
PUBLICATION RECORD
- Publication year: 2015
- Venue: North American Chapter of the Association for Computational Linguistics
- Fields of study: Linguistics, Computer Science
- Source metadata: Semantic Scholar