Online Large-Margin Training of Syntactic and Structural Translation Features

Published 2008 in Conference on Empirical Methods in Natural Language Processing

ABSTRACT

Minimum-error-rate training (MERT) is a bottleneck for current development in statistical machine translation because it is limited in the number of weights it can reliably optimize. Building on the work of Watanabe et al., we explore the use of the MIRA algorithm of Crammer et al. as an alternative to MERT. We first show that by parallel processing and exploiting more of the parse forest, we can obtain results using MIRA that match or surpass MERT in terms of both translation quality and computational cost. We then test the method on two classes of features that address deficiencies in the Hiero hierarchical phrase-based model: first, we simultaneously train a large number of Marton and Resnik's soft syntactic constraints, and, second, we introduce a novel structural distortion model. In both cases we obtain significant improvements in translation performance. Optimizing them in combination, for a total of 56 feature weights, we improve performance by 2.6 Bleu on a subset of the NIST 2006 Arabic-English evaluation data.
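
As a rough illustration of the kind of update MIRA performs, the following is a minimal sketch of a single-constraint, passive-aggressive style step on sparse feature vectors. It is an assumption-laden simplification and not the paper's actual training procedure, which the abstract describes as using parallel processing and more of the parse forest. The function name `mira_update`, the feature names, and the step-size cap `C` are all hypothetical.

```python
from typing import Dict

FeatureVector = Dict[str, float]


def mira_update(weights: FeatureVector,
                oracle_feats: FeatureVector,
                hyp_feats: FeatureVector,
                loss: float,
                C: float = 0.01) -> FeatureVector:
    """Return new weights after one single-constraint MIRA-style step.

    Illustrative only: a real system would typically enforce several
    constraints per sentence and average weights across iterations.
    """
    # Feature difference between the oracle and the model's hypothesis.
    keys = set(oracle_feats) | set(hyp_feats)
    delta = {k: oracle_feats.get(k, 0.0) - hyp_feats.get(k, 0.0) for k in keys}

    # Current margin: how much the model already prefers the oracle.
    margin = sum(weights.get(k, 0.0) * v for k, v in delta.items())
    norm_sq = sum(v * v for v in delta.values())

    # No update if the margin already covers the loss, or the
    # two candidates have identical features.
    violation = loss - margin
    if violation <= 0.0 or norm_sq == 0.0:
        return dict(weights)

    # Smallest step that satisfies the constraint, capped at C.
    tau = min(C, violation / norm_sq)

    new_weights = dict(weights)
    for k, v in delta.items():
        new_weights[k] = new_weights.get(k, 0.0) + tau * v
    return new_weights


# Hypothetical two-feature example: "lm" and "dist" are made-up names.
w = mira_update({"lm": 1.0},
                oracle_feats={"lm": 1.0, "dist": 0.0},
                hyp_feats={"lm": 2.0, "dist": 1.0},
                loss=0.5, C=1.0)
```

With a large enough `C`, the updated weights separate the oracle from the hypothesis by exactly the loss; a smaller `C` makes the step more conservative.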

PUBLICATION RECORD

Publication year: 2008
Venue: Conference on Empirical Methods in Natural Language Processing
Publication date: 2008-10-25
Fields of study: Linguistics, Computer Science
Identifiers: DOI 10.3115/1613715.1613747
Source metadata: Semantic Scholar

REFERENCES

A Tree Sequence Alignment-based Tree-to-Tree Translation Model (2008; cited by this paper)
Cohesive Phrase-Based Decoding for Statistical Machine Translation (2008; cited by this paper)
Complexity of Finding the BLEU-optimal Hypothesis in a Confusion Network (2008; cited by this paper)
Regularization and Search for Minimum Error Rate Training (2008; cited by this paper)
Soft Syntactic Constraints for Hierarchical Phrased-Based Translation (2008; influential reference)
A Discriminative Latent Variable Model for Statistical Machine Translation (2008; cited by this paper)
Randomized Language Models via Perfect Hash Functions (2008; cited by this paper)
Beyond Log-Linear Models: Boosted Minimum Error Rate Training for N-best Re-ranking (2008; cited by this paper)
Forest-Based Translation (2008; cited by this paper)
Word Sense Disambiguation Improves Statistical Machine Translation (2007; cited by this paper)
Online Large-Margin Training for Statistical Machine Translation (2007; influential reference)
Hierarchical Phrase-Based Translation (2007; cited by this paper)
Online learning methods for discriminative training of phrase based statistical machine translation (2007; cited by this paper)
Comparing Reordering Constraints for SMT Using Efficient BLEU Oracle Computation (2007; influential reference)
A Discriminative Global Training Algorithm for Statistical MT (2006; cited by this paper)
An End-to-End Discriminative Approach to Machine Translation (2006; cited by this paper)
Minimum Risk Annealing for Training Log-Linear Models (2006; cited by this paper)
Scalable Discriminative Learning for Natural Language Parsing and Translation (2006; cited by this paper)
A Hierarchical Phrase-Based Model for Statistical Machine Translation (2005; influential reference)
Learning structured prediction models: a large margin approach (2005; cited by this paper)
A Practical Minimal Perfect Hashing Method (2005; cited by this paper)
Online Large-Margin Training of Dependency Parsers (2005; cited by this paper)
Statistical Significance Tests for Machine Translation Evaluation (2004; cited by this paper)
Automatic Tagging of Arabic Text: From Raw Text to Base Phrase Chunks (2004; cited by this paper)
ORANGE: a Method for Evaluating Automatic Evaluation Metrics for Machine Translation (2004; cited by this paper)
Max-Margin Parsing (2004; cited by this paper)
Ultraconservative Online Algorithms for Multiclass Problems (2003; influential reference)
Statistical Phrase-Based Translation (2003; cited by this paper)
Online Passive-Aggressive Algorithms (2003; cited by this paper)
Minimum Error Rate Training in Statistical Machine Translation (2003; cited by this paper)
Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms (2002; cited by this paper)
Discriminative Training and Maximum Entropy Models for Statistical Machine Translation (2002; cited by this paper)
Fast Exact Inference with a Factored Model for Natural Language Parsing (2002; cited by this paper)
Bleu: a Method for Automatic Evaluation of Machine Translation (2002; cited by this paper)
Fast Training of Support Vector Machines using Sequential Minimal Optimization (2000; cited by this paper)
Large Margin Classification Using the Perceptron Algorithm (1998; cited by this paper)
A Polynomial-Time Algorithm for Statistical Machine Translation (1996; cited by this paper)

CITED BY

Style Transfer with Multi-iteration Preference Optimization (2024; cites this paper)
The Socio-Economic Impact of University in Thailand: Evidence from Chiang Mai University (2024; cites this paper)
Accurate Knowledge Distillation via n-best Reranking (2023; cites this paper)
Statistical machine translation based on weighted syntax–semantics (2020; cites this paper)
The Structured Weighted Violations MIRA (2020; cites this paper)
Data Selection with Cluster-Based Language Difference Models and Cynical Selection (2019; cites this paper)
A simple discriminative training method for machine translation with large-scale features (2019; cites this paper)
Baidu Neural Machine Translation Systems for WMT19 (2019; cites this paper)
Improving Statistical Machine Translation Quality Using Differential Evolution (2019; cites this paper)
Online Bayesian max-margin subspace learning for multi-view classification and regression (2019; cites this paper)
Syntactically Supervised Transformers for Faster Neural Machine Translation (2019; cites this paper)
A unified framework and models for integrating translation memory into phrase-based statistical machine translation (2019; cites this paper)
Towards Building a Strong Transformer Neural Machine Translation System (2018; cites this paper)
Preference learning for machine translation (2018; cites this paper)
Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation (2017; influential citation)
Discourse Structure in Machine Translation Evaluation (2017; cites this paper)
Robust Tuning Datasets for Statistical Machine Translation (2017; cites this paper)
Structured learning with inexact search: advances in shift-reduce CCG parsing (2017; cites this paper)
Optimizing Non-Decomposable Evaluation Metrics for Neural Machine Translation (2017; cites this paper)
A survey of domain adaptation for statistical machine translation (2017; cites this paper)
Maximum entropy models for sequences: scaling up from tagging to translation (2017; cites this paper)
Syntactic discriminative language model rerankers for statistical machine translation (2017; cites this paper)
Effective training and efficient decoding for statistical machine translation (2017; cites this paper)
A comparison of discriminative training criteria for continuous space translation models (2017; influential citation)
Training machine translation for human acceptability (2016; cites this paper)
Exploitation d'informations riches pour guider la traduction automatique statistique. (Complex Feature Guidance for Statistical Machine Translation) (2016; cites this paper)
How Translation Alters Sentiment (2016; cites this paper)
Online Bayesian Max-Margin Subspace Multi-View Learning (2016; cites this paper)
Listwise Ranking Functions for Statistical Machine Translation (2016; cites this paper)
Optimization for Statistical Machine Translation: A Survey (2016; cites this paper)
Improving Statistical Machine Translation Performance by Oracle-BLEU Model Re-estimation (2016; influential citation)
Topic-based term translation models for statistical machine translation (2016; cites this paper)
Online Learning for Statistical Machine Translation (2016; cites this paper)
Adaptation of Statistical Machine Translation with Sparse Features (2016; cites this paper)
Comparative Study on Hierarchical Phrase Structures and Linguistic Phrase Structures (2016; cites this paper)
Different Contributions to Cost-Effective Transcription and Translation of Video Lectures (2016; cites this paper)
Latent Structure Discriminative Learning for Natural Language Processing (2016; cites this paper)
An Empirical Evaluation of Noise Contrastive Estimation for the Neural Network Joint Model of Translation (2016; cites this paper)
Modèles exponentiels et contraintes sur les espaces de recherche en traduction automatique et pour le transfert cross-lingue. (Log-linear Models and Search Space Constraints in Statistical Machine Translation and Cross-lingual Transfer) (2016; cites this paper)
Sequential Decisions and Predictions in Natural Language Processing (2016; cites this paper)
Aligning the foundations of hierarchical statistical machine translation (2016; influential citation)
APRO: All-Pairs Ranking Optimization for MT Tuning (2015; cites this paper)
Coactive Learning for Interactive Machine Translation (2015; cites this paper)
Sentiment after Translation: A Case-Study on Arabic Social Media Posts (2015; cites this paper)
Analyzing Optimization for Statistical Machine Translation: MERT Learns Verbosity, PRO Learns Length (2015; cites this paper)
Model Invertibility Regularization: Sequence Alignment With or Without Parallel Data (2015; cites this paper)
Drem: The AFRL Submission to the WMT15 Tuning Task (2015; influential citation)
MT Tuning on RED: A Dependency-Based Evaluation Metric (2015; cites this paper)
Spectral Probablistic Modeling and Applications to Natural Language Processing (2015; cites this paper)
Name-aware language model adaptation and sparse features for statistical machine translation (2015; cites this paper)
Une méthode discriminant formation simple pour la traduction automatique avec Grands Caractéristiques (2015; cites this paper)
Neural Network-based Multilingual Translation Models (2015; cites this paper)
A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena (2015; cites this paper)
Bag-of-Words Forced Decoding for Cross-Lingual Information Retrieval (2015; cites this paper)
Joshua 6: A phrase-based and hierarchical statistical machine translation system (2015; cites this paper)
Learning to Rank Algorithms and Their Application in Machine Translation (2015; influential citation)
Class-based N-gram language difference models for data selection (2015; cites this paper)
Improving Semantic Parsing with Enriched Synchronous Context-Free Grammar (2015; cites this paper)
Sequence Alignment With or Without Parallel Data (2015; cites this paper)
pycdec: A Python Interface to cdec (2015; cites this paper)
Latent structure discriminative learning for natural language processing (2015; cites this paper)
A Coactive Learning View of Online Structured Prediction in Statistical Machine Translation (2015; cites this paper)
A Comparison of Update Strategies for Large-Scale Maximum Expected BLEU Training (2015; cites this paper)
Quantifying Cross-lingual Semantic Similarity for Natural Language Processing Applications (2015; cites this paper)
ListNet-based MT Rescoring (2015; cites this paper)
Data Selection With Fewer Words (2015; cites this paper)
A pilot study towards end-to-end MT training (2015; cites this paper)
Online multiclass learning with "bandit" feedback under a Passive-Aggressive approach (2015; cites this paper)
The Geometry of Statistical Machine Translation (2015; cites this paper)
Dynamic Topic Adaptation for Improved Contextual Modelling in Statistical Machine Translation (2015; cites this paper)
Lattice Desegmentation for Statistical Machine Translation (2014; influential citation)
Translation-based ranking in cross-language information retrieval (2014; influential citation)
Topic-Based Dissimilarity and Sensitivity Models for Translation Rule Selection (2014; cites this paper)
DiscoTK: Using Discourse Structure for Machine Translation Evaluation (2014; cites this paper)
Using Discourse Structure Improves Machine Translation Evaluation (2014; cites this paper)
A Unified Model for Soft Linguistic Reordering Constraints in Statistical Machine Translation (2014; influential citation)
Recursive neural network based word topology model for hierarchical phrase-based speech translation (2014; cites this paper)
Sampling Tree Fragments from Forests (2014; cites this paper)
Learning Translational and Knowledge-based Similarities from Relevance Rankings for Cross-Language Retrieval (2014; cites this paper)
Search-Aware Tuning for Machine Translation (2014; cites this paper)
Exact Sampling and Optimisation in Statistical Machine Translation (2014; cites this paper)
Automatic Post-Editing Method Using Translation Knowledge Based on Intuitive Common Parts Continuum for Statistical Machine Translation (2014; cites this paper)
Maximum-entropy word alignment and posterior-based phrase extraction for machine translation (2014; cites this paper)
Hierarchical MT Training using Max-Violation Perceptron (2014; cites this paper)
Online discriminative learning for machine translation with binary-valued feedback (2014; cites this paper)
Head of Delegation (2014; cites this paper)
Using Comparable Corpora to Augment Statistical Machine Translation Models in Low Resource Settings (2014; cites this paper)
Algebraic decoder specification: coupling formal-language theory and statistical machine translation (2014; cites this paper)
Online adaptation to post-edits for phrase-based statistical machine translation (2014; cites this paper)
Online Learning in Tensor Space (2014; cites this paper)
Learning to translate queries for CLIR (2014; cites this paper)
Locally Non-Linear Learning for Statistical Machine Translation via Discretization and Structured Regularization (2014; influential citation)
Discriminative Training for Log-Linear Based SMT: Global or Local Methods (2014; influential citation)
Classification and Ranking Approaches to Discriminative Language Modeling for ASR (2013; cites this paper)
Fast and Adaptive Online Training of Feature-Rich Translation Models (2013; cites this paper)
Spectral Probabilistic Modeling and Applications to Natural Language Processing (2013; cites this paper)
Online Distributed Passive-Aggressive Algorithm for Structured Learning (2013; cites this paper)
Integrating morpho-syntactic features in English-Arabic statistical machine translation (2013; cites this paper)
Joshua 5.0: Sparser, Better, Faster, Server (2013; cites this paper)
Tuning SMT with a Large Number of Features via Online Feature Grouping (2013; cites this paper)