Shy-hunyuan-MT at WMT25 General Machine Translation Shared Task

Mao Zheng, Zheng Li, Yang Du, Bingxin Qu, Mingyang Song

Published 2025 in Proceedings of the Tenth Conference on Machine Translation

ABSTRACT

In this paper, we present our submission to the WMT25 shared task on machine translation, for which we propose a Synergy-enhanced policy optimization framework, named Shy. This novel two-phase training framework synergistically combines knowledge distillation and fusion via reinforcement learning. In the first phase, we introduce a multi-stage training framework that harnesses the complementary strengths of multiple state-of-the-art large language models to generate diverse, high-quality translation candidates. These candidates serve as pseudo-references to guide the supervised fine-tuning of our model, Hunyuan-7B, effectively distilling the collective knowledge of multiple expert systems into a single efficient model. In the second phase, we further refine the distilled model through Group Relative Policy Optimization (GRPO), a reinforcement learning technique that employs a composite reward function. By computing rewards from multiple perspectives, our model achieves better alignment with human preferences and evaluation metrics. Extensive experiments across multiple language pairs demonstrate that our model, Shy-hunyuan-MT, yields substantial improvements in translation quality over baselines. Notably, our framework achieves performance competitive with state-of-the-art systems while maintaining computational efficiency through knowledge distillation and fusion.
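The second phase described above can be illustrated with a minimal sketch of how a composite reward and GRPO-style group-relative advantages might be computed. This is an assumption-laden illustration, not the authors' implementation: the reward components, their weights, and the function names (`composite_reward`, `group_relative_advantages`) are hypothetical; GRPO's defining step, normalizing each candidate's reward against its sampling group, is shown in simplified form.

```python
from statistics import mean, pstdev

def composite_reward(scores, weights):
    """Weighted sum of per-perspective reward signals
    (e.g. a quality-metric score and a preference score;
    the components actually used by Shy are not specified here)."""
    return sum(w * s for w, s in zip(weights, scores))

def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each candidate's reward
    against the mean and std of its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mu) / sigma for r in rewards]

# Four sampled translation candidates for one source sentence,
# each scored from two illustrative reward perspectives.
per_candidate_scores = [(0.82, 0.70), (0.75, 0.66), (0.90, 0.81), (0.60, 0.55)]
weights = (0.5, 0.5)

rewards = [composite_reward(s, weights) for s in per_candidate_scores]
advantages = group_relative_advantages(rewards)
```

The normalized advantages would then weight the policy-gradient update, so candidates scoring above their group mean are reinforced and those below are suppressed, without requiring a separate value model.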

PUBLICATION RECORD

  • Publication year

    2025

  • Venue

    Proceedings of the Tenth Conference on Machine Translation


  • Source metadata

    Semantic Scholar
