Modeling Coverage for Neural Machine Translation

Zhaopeng Tu, Zhengdong Lu, Yang Liu, Xiaohua Liu, Hang Li

Published 2016 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

The attention mechanism has enhanced state-of-the-art Neural Machine Translation (NMT) by jointly learning to align and translate. However, it tends to ignore past alignment information, which often leads to over-translation and under-translation. To address this problem, we propose coverage-based NMT in this paper. We maintain a coverage vector to keep track of the attention history. The coverage vector is fed to the attention model to help adjust future attention, which lets the NMT system give more consideration to untranslated source words. Experiments show that the proposed approach significantly improves both translation quality and alignment quality over standard attention-based NMT.
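
The coverage idea in the abstract can be illustrated with a small sketch. It is not the paper's model: the abstract gives no formulas, so the function names (`softmax`, `attend_with_coverage`), the fixed scalar `penalty`, and the toy scores are all assumptions made for illustration. A learned attention network would replace the hand-set penalty. The sketch only shows the bookkeeping: a coverage vector accumulates the attention each source word has received, and that history lowers the score of already-attended words at the next step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_coverage(base_scores, coverage, penalty=1.0):
    """One decoding step of a toy coverage-aware attention.

    base_scores: unnormalized attention scores, one per source word.
    coverage:    attention mass each source word has received so far.
    penalty:     hypothetical fixed weight standing in for a learned term.
    Returns the attention weights and the updated coverage vector.
    """
    # Source words that were already attended to get a lower score.
    adjusted = [s - penalty * c for s, c in zip(base_scores, coverage)]
    weights = softmax(adjusted)
    # Coverage is the running sum of attention weights per source word.
    new_coverage = [c + w for c, w in zip(coverage, weights)]
    return weights, new_coverage


if __name__ == "__main__":
    # Toy case: the raw scores always favor source word 0, which without
    # coverage would be attended to (and translated) again and again.
    scores = [2.0, 1.0, 0.5]
    coverage = [0.0, 0.0, 0.0]
    for step in range(3):
        weights, coverage = attend_with_coverage(scores, coverage)
        print(step, [round(w, 3) for w in weights])
```

With `penalty=0.0` the weights are identical at every step, which mirrors the behavior the abstract describes as ignoring past alignment information. With a positive penalty, attention on the dominant source word shrinks from one step to the next and shifts toward words that have received less attention so far.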

PUBLICATION RECORD

Publication year: 2016
Venue: Annual Meeting of the Association for Computational Linguistics
Publication date: 2016-01-19
Fields of study: Computer Science
Identifiers: DOI 10.18653/v1/P16-1008; arXiv 1601.04811
