Transfer Learning for Low-Resource Neural Machine Translation

Barret Zoph, Deniz Yuret, Jonathan May, Kevin Knight

Published 2016 in Conference on Empirical Methods in Natural Language Processing

ABSTRACT

The encoder-decoder framework for neural machine translation (NMT) has been shown to be effective in large-data scenarios, but it is much less effective for low-resource languages. We present a transfer learning method that significantly improves Bleu scores across a range of low-resource languages. Our key idea is to first train a high-resource language pair (the parent model), then transfer some of the learned parameters to the low-resource pair (the child model) to initialize and constrain training. Using our transfer learning method, we improve baseline NMT models by an average of 5.6 Bleu on four low-resource language pairs. Ensembling and unknown word replacement add another 2 Bleu, which brings the NMT performance on low-resource machine translation close to that of a strong syntax-based machine translation (SBMT) system, exceeding its performance on one language pair. Additionally, using the transfer learning model for re-scoring, we can improve the SBMT system by an average of 1.3 Bleu, improving the state of the art on low-resource machine translation.
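
The parent-to-child idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the parameter names, their sizes, the choice of which parameters are transferred or frozen, and the dummy gradients are all assumptions made for illustration. It only shows the two roles the abstract assigns to the parent model: initializing the child, and constraining its training by holding some transferred parameters fixed.

```python
import random


def init_params(shapes, seed):
    """Randomly initialise a dict of named parameter vectors (hypothetical layout)."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-0.1, 0.1) for _ in range(size)]
            for name, size in shapes.items()}


def transfer(parent, child, transferred):
    """Copy the named parent parameters into the child; keep the rest as initialised."""
    out = {name: list(values) for name, values in child.items()}
    for name in transferred:
        out[name] = list(parent[name])
    return out


def sgd_step(params, grads, frozen, lr=0.5):
    """One SGD update that skips frozen parameters, so they keep the parent values."""
    return {name: (list(values) if name in frozen
                   else [v - lr * g for v, g in zip(values, grads[name])])
            for name, values in params.items()}


# Hypothetical parameter groups and sizes, chosen only to keep the example small.
SHAPES = {"source_embeddings": 4, "encoder": 6, "decoder": 6, "target_embeddings": 4}

parent = init_params(SHAPES, seed=0)  # stands in for a trained high-resource model
child = init_params(SHAPES, seed=1)   # freshly initialised low-resource model

# Assumption: this split between transferred and frozen parameters is illustrative.
TRANSFERRED = ["encoder", "decoder", "target_embeddings"]
FROZEN = {"target_embeddings"}

child = transfer(parent, child, TRANSFERRED)
grads = {name: [1.0] * size for name, size in SHAPES.items()}  # dummy gradients
updated = sgd_step(child, grads, FROZEN)
```

After one update, `updated["target_embeddings"]` still equals the parent's values, while the other transferred groups start from the parent's values and then move with the child's training data.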

PUBLICATION RECORD

Publication year: 2016
Venue: Conference on Empirical Methods in Natural Language Processing
Publication date: 2016-04-01
Fields of study: Computer Science
Identifiers: DOI 10.18653/v1/D16-1163; arXiv 1604.02201
