Decoding with Large-Scale Neural Language Models Improves Translation
Ashish Vaswani, Yinggong Zhao, Victoria Fossum, David Chiang
Published 2013 in Conference on Empirical Methods in Natural Language Processing
ABSTRACT
We explore the application of neural language models to machine translation. We develop a new model that combines the neural probabilistic language model of Bengio et al., rectified linear units, and noise-contrastive estimation, and we incorporate it into a machine translation system both by reranking k-best lists and by direct integration into the decoder. Our large-scale, large-vocabulary experiments across four language pairs show that our neural language model improves translation quality by up to 1.1 Bleu.
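The abstract mentions training with noise-contrastive estimation (NCE), which avoids normalizing over the full vocabulary by training the model to discriminate observed words from samples drawn from a noise distribution. As a minimal illustration (the function name, toy inputs, and NumPy setup are mine, not the paper's), the per-example NCE loss can be sketched as:

```python
import numpy as np

def nce_loss(score_target, scores_noise, log_kq_target, log_kq_noise):
    """Noise-contrastive estimation loss for one (context, word) pair.

    score_target / scores_noise are the model's unnormalized log-scores
    s(w, h) for the observed word and for k noise samples; log_kq_* is
    log(k * q(w)) under the noise distribution q. Illustrative sketch only.
    """
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    def log_sigmoid(x):
        return -np.logaddexp(0.0, -x)

    # Classify the observed word as "data": P(D=1 | w, h) = sigmoid(s - log(k q)).
    loss = -log_sigmoid(score_target - log_kq_target)
    # Classify each noise sample as "noise": P(D=0) = sigmoid(-(s - log(k q))).
    loss -= np.sum(log_sigmoid(-(scores_noise - log_kq_noise)))
    return loss
```

Raising the model's score for the observed word lowers the loss, which is the gradient signal that trains the network without ever computing a softmax over the vocabulary.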
PUBLICATION RECORD
- Publication year: 2013
- Venue: Conference on Empirical Methods in Natural Language Processing
- Publication date: 2013-10-01
- Fields of study: Linguistics, Computer Science
- Source metadata: Semantic Scholar