Variational Decoding for Statistical Machine Translation
Zhifei Li, Jason Eisner, S. Khudanpur
Published 2009 in Annual Meeting of the Association for Computational Linguistics
ABSTRACT
Statistical models in machine translation exhibit spurious ambiguity. That is, the probability of an output string is split among many distinct derivations (e.g., trees or segmentations). In principle, the goodness of a string is measured by the total probability of its many derivations. However, finding the best string (e.g., during decoding) is then computationally intractable. Therefore, most systems use a simple Viterbi approximation that measures the goodness of a string using only its most probable derivation. Instead, we develop a variational approximation, which considers all the derivations but still allows tractable decoding. Our particular variational distributions are parameterized as n-gram models. We also analytically show that interpolating these n-gram models for different n is similar to minimum-risk decoding for BLEU (Tromble et al., 2008). Experiments show that our approach improves the state of the art.
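As a rough illustration of the distinction the abstract draws, the toy Python sketch below flattens a hypothetical derivation forest into (string, derivation-probability) pairs and compares Viterbi decoding (score each string by its single best derivation), exact decoding (sum the probabilities of all of a string's derivations), and a simplified variational decoder that fits a bigram model q from expected n-gram counts and scores strings under q. This is only a brute-force sketch under those assumptions, not the paper's algorithm, which operates on packed forests with dynamic programming; all names and the tiny example data are invented for illustration.

```python
# Toy sketch (not the paper's hypergraph-based method): spurious ambiguity means
# several derivations d yield the same string y, so p(y) = sum_d p(d). Viterbi
# decoding instead ranks strings by their single best derivation. The variational
# idea fits an n-gram model q(y) to the derivation distribution and decodes with q.
from collections import defaultdict
from math import prod

# Hypothetical derivation forest, flattened to (output string, derivation prob).
derivations = [
    ("a b c", 0.20), ("a b c", 0.15), ("a b c", 0.10),  # many weak derivations
    ("a c b", 0.30),                                     # one strong derivation
    ("b a c", 0.25),
]

def viterbi_decode(ders):
    """Rank each string by its single most probable derivation."""
    best = defaultdict(float)
    for y, p in ders:
        best[y] = max(best[y], p)
    return max(best, key=best.get)

def exact_decode(ders):
    """Rank each string by the total probability of all its derivations."""
    total = defaultdict(float)
    for y, p in ders:
        total[y] += p
    return max(total, key=total.get)

def fit_bigram_q(ders):
    """Fit a bigram variational model q from expected bigram counts
    under the derivation distribution (computed here by brute force)."""
    counts = defaultdict(float)
    context = defaultdict(float)
    for y, p in ders:
        toks = ["<s>"] + y.split() + ["</s>"]
        for prev, cur in zip(toks, toks[1:]):
            counts[(prev, cur)] += p
            context[prev] += p
    return {bg: c / context[bg[0]] for bg, c in counts.items()}

def q_score(q, y):
    """Score a candidate string under the fitted bigram model q."""
    toks = ["<s>"] + y.split() + ["</s>"]
    return prod(q.get((prev, cur), 1e-9) for prev, cur in zip(toks, toks[1:]))

if __name__ == "__main__":
    q = fit_bigram_q(derivations)
    candidates = {y for y, _ in derivations}
    print("Viterbi choice:    ", viterbi_decode(derivations))   # 'a c b'
    print("Exact-sum choice:  ", exact_decode(derivations))     # 'a b c'
    print("Variational choice:", max(candidates, key=lambda y: q_score(q, y)))
```

In this toy example, Viterbi picks the string with the single strongest derivation, while both the exact sum and the fitted bigram model prefer the string whose probability mass is spread across many derivations, which is the behavior the variational approximation is designed to recover tractably.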
PUBLICATION RECORD
- Publication year: 2009
- Venue: Annual Meeting of the Association for Computational Linguistics
- Publication date: 2009-08-02
- Fields of study: Linguistics, Computer Science
- Source metadata: Semantic Scholar
REFERENCES
- 32 references
CITED BY
- 85 citing papers