Unsupervised Token-wise Alignment to Improve Interpretation of Encoder-Decoder Models

Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, M. Nagata

Published 2018 in BlackboxNLP@EMNLP

ABSTRACT

Developing a method for understanding the inner workings of black-box neural methods is an important research endeavor. Conventionally, many studies have used an attention matrix to interpret how Encoder-Decoder-based models translate a given source sentence to the corresponding target sentence. However, recent studies have empirically revealed that an attention matrix is not optimal for token-wise translation analyses. We propose a method that explicitly models the token-wise alignment between the source and target sequences to provide a better analysis. Experiments show that our method can acquire token-wise alignments that are superior to those of an attention mechanism.

PUBLICATION RECORD

Publication year: 2018
Venue: BlackboxNLP@EMNLP
Publication date: 2018-11-01
Fields of study: Computer Science
Identifiers: DOI 10.18653/v1/W18-5410
Source metadata: Semantic Scholar

REFERENCES

Six Challenges for Neural Machine Translation
2017, cited by this paper
What does Attention in Neural Machine Translation Pay Attention to?
2017, cited by this paper
Visualizing and Understanding Neural Machine Translation
2017, cited by this paper
When to Finish? Optimal Beam Search for Neural Text Generation (modulo beam size)
2017, cited by this paper
Selective Encoding for Abstractive Sentence Summarization
2017, cited by this paper
Modeling Coverage for Neural Machine Translation
2016, cited by this paper
Cutting-off Redundant Repeating Generations for Neural Abstractive Summarization
2016, influential reference
A Dataset and Evaluation Metrics for Abstractive Compression of Sentences and Short Paragraphs
2016, cited by this paper
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
2016, cited by this paper
Neural Machine Translation with Supervised Attention
2016, cited by this paper
Effective Approaches to Attention-based Neural Machine Translation
2015, influential reference
Neural Machine Translation of Rare Words with Subword Units
2015, influential reference
Neural Responding Machine for Short-Text Conversation
2015, cited by this paper
A Neural Attention Model for Abstractive Sentence Summarization
2015, influential reference
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
2014, cited by this paper
Sequence to Sequence Learning with Neural Networks
2014, cited by this paper
Neural Machine Translation by Jointly Learning to Align and Translate
2014, cited by this paper
Adam: A Method for Stochastic Optimization
2014, cited by this paper
Annotated Gigaword
2012, cited by this paper
Alignment by Agreement
2006, cited by this paper
Long Short-Term Memory
1997, cited by this paper

CITED BY

Explainable machine learning in cybersecurity: A survey
2022, cites this paper
A Survey of the State of Explainable AI for Natural Language Processing
2020, cites this paper
A Large-Scale Multi-Length Headline Corpus for Analyzing Length-Constrained Headline Generation Model Evaluation
2019, cites this paper
Improving Latent Alignment in Text Summarization by Generalizing the Pointer Generator
2019, cites this paper
Improved Recurrent Neural Networks (RNN) Based Intelligent Fund Transaction Model
2019, cites this paper