A Deep Reinforced Model for Abstractive Summarization

Published 2017 in International Conference on Learning Representations

ABSTRACT

Attentional, RNN-based encoder-decoder models for abstractive summarization have achieved good performance on short input and output sequences. For longer documents and summaries, however, these models often produce repetitive and incoherent phrases. We introduce a neural network model with a novel intra-attention mechanism that attends over the input and the continuously generated output separately, and a new training method that combines standard supervised word prediction with reinforcement learning (RL). Models trained only with supervised learning often exhibit "exposure bias": they assume that the ground truth is provided at each step during training. However, when standard word prediction is combined with the global sequence prediction training of RL, the resulting summaries become more readable. We evaluate this model on the CNN/Daily Mail and New York Times datasets. Our model obtains a 41.16 ROUGE-1 score on the CNN/Daily Mail dataset, an improvement over previous state-of-the-art models. Human evaluation also shows that our model produces higher-quality summaries.
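
The abstract describes the training method only at a high level, so the following is a minimal sketch of how a supervised word-prediction loss might be blended with a sequence-level RL loss. It assumes a self-critical policy-gradient term, in which the reward of a sampled summary is compared against the reward of a greedily decoded baseline. The function names, the `gamma` mixing weight, and the example numbers are illustrative assumptions, not values taken from this record; the rewards stand in for a sequence-level metric such as ROUGE.

```python
import math


def ml_loss(gold_log_probs):
    """Teacher-forced maximum-likelihood loss: the negative sum of the
    log-probabilities the model assigns to the ground-truth tokens."""
    return -sum(gold_log_probs)


def self_critical_rl_loss(sample_log_probs, sample_reward, baseline_reward):
    """Policy-gradient loss with a greedy-decoding baseline (assumed form).

    Minimizing it raises the likelihood of a sampled summary whose reward
    beats the baseline and lowers it when the sample scores worse.
    """
    return (baseline_reward - sample_reward) * sum(sample_log_probs)


def mixed_loss(gold_log_probs, sample_log_probs,
               sample_reward, baseline_reward, gamma=0.5):
    """Convex combination of the RL and ML losses; gamma is a hypothetical
    hyperparameter in [0, 1] that weights the RL term."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    rl = self_critical_rl_loss(sample_log_probs, sample_reward, baseline_reward)
    ml = ml_loss(gold_log_probs)
    return gamma * rl + (1.0 - gamma) * ml


if __name__ == "__main__":
    # Toy per-token log-probabilities for a two-token summary.
    gold = [math.log(0.5), math.log(0.25)]    # ground-truth tokens
    sample = [math.log(0.5), math.log(0.5)]   # tokens of a sampled summary
    # Made-up rewards: the sample scores 0.4, the greedy baseline 0.3.
    print(mixed_loss(gold, sample, sample_reward=0.4, baseline_reward=0.3))
```

With `gamma=0` the sketch reduces to pure supervised training, and with `gamma=1` it optimizes the sequence-level reward alone. Intermediate values trade the word-level training signal against the global sequence-level one.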

PUBLICATION RECORD

Publication year
2017
Venue
International Conference on Learning Representations
Publication date
2017-05-11
Fields of study
Computer Science
Identifiers
arXiv 1705.04304
Source metadata
Semantic Scholar

REFERENCES

Get To The Point: Summarization with Pointer-Generator Networks
2017, cited by this paper
Abstractive Sentence Summarization with Attentive Recurrent Neural Networks
2016, cited by this paper
Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling
2016, cited by this paper
Self-Critical Sequence Training for Image Captioning
2016, cited by this paper
Pointer Sentinel Mixture Models
2016, cited by this paper
Latent Predictor Networks for Code Generation
2016, cited by this paper
Using the Output Embedding to Improve Language Models
2016, cited by this paper
Distraction-Based Neural Networks for Modeling Document
2016, cited by this paper
Efficient Summarization with Read-Again and Copy Mechanism
2016, cited by this paper
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
2016, cited by this paper
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
2016, cited by this paper
Reward Augmented Maximum Likelihood for Neural Structured Prediction
2016, cited by this paper
The Role of Discourse Units in Near-Extractive Summarization
2016, cited by this paper
Learning-Based Single-Document Summarization with Compression and Anaphoricity Constraints
2016, cited by this paper
Temporal Attention Model for Neural Machine Translation
2016, cited by this paper
Improving the Estimation of Word Importance for News Multi-Document Summarization - Extended Technical Report
2016, cited by this paper
Pointing the Unknown Words
2016, cited by this paper
SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents
2016, cited by this paper
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
2016, cited by this paper
Long Short-Term Memory-Networks for Machine Reading
2016, cited by this paper
RECURRENT NEURAL NETWORKS
2015, cited by this paper
A Neural Attention Model for Abstractive Sentence Summarization
2015, cited by this paper
Pointer Networks
2015, cited by this paper
Teaching Machines to Read and Comprehend
2015, cited by this paper
System Combination for Multi-document Summarization
2015, cited by this paper
Improving Multi-Step Prediction of Learned Time Series Models
2015, cited by this paper
HEADS: Headline Generation as Sequence Prediction Using an Abstract Feature-Rich Space
2015, cited by this paper
大規模要約資源としてのNew York Times Annotated Corpus [The New York Times Annotated Corpus as a Large-Scale Summarization Resource]
2015, influential reference
Sequence to Sequence Learning with Neural Networks
2014, cited by this paper
Neural Machine Translation by Jointly Learning to Align and Translate
2014, cited by this paper
The Stanford CoreNLP Natural Language Processing Toolkit
2014, influential reference
GloVe: Global Vectors for Word Representation
2014, cited by this paper
Adam: A Method for Stochastic Optimization
2014, influential reference
Detecting Information-Dense Texts in Multiple News Domains
2014, cited by this paper
Distributed Representations of Words and Phrases and their Compositionality
2013, cited by this paper
Overcoming the Lack of Parallel Data in Sentence Compression
2013, cited by this paper
ROUGE: A Package for Automatic Evaluation of Summaries
2004, cited by this paper
Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
2004, cited by this paper
Hedge Trimmer: A Parse-and-Trim Approach to Headline Generation
2003, cited by this paper
Automatic Text Summarization Using a Machine Learning Approach
2002, influential reference
Long Short-Term Memory
1997, influential reference
A Learning Algorithm for Continually Running Fully Recurrent Neural Networks
1989, cited by this paper
Identification and Characterization of Newsworthy Verbs in World News
year unknown, cited by this paper

CITED BY

A Deep Learning Framework for Extracting and Summarizing Text from Images
2026, cites this paper
UltraLogic: Enhancing LLM Reasoning through Large-Scale Data Synthesis and Bipolar Float Reward
2026, cites this paper
FlashEvaluator: Expanding Search Space with Parallel Evaluation
2026, cites this paper
Alignment-Aware Model Adaptation via Feedback-Guided Optimization
2026, cites this paper
Learning to Reason as Action Abstractions with Scalable Mid-Training RL
2025, cites this paper
Contrastive Learning for Improved Abstractive Sentence Summarization
2025, cites this paper
Preference Learning with Lie Detectors can Induce Honesty or Evasion
2025, cites this paper
Chatbot Deployment Considerations for Application-Agnostic Human-Machine Dialogues
2025, cites this paper
Lite Mongolian-Chinese Neural Machine Translation: Dynamic Convolution with Long-Range Attention
2025, cites this paper
Towards Efficient LLM Inference via Collective and Adaptive Speculative Decoding
2025, cites this paper
Enhancing sentiment analysis of moroccan dialect through transformer-based language models architectures and active learning strategies
2025, cites this paper
An Extractive Text News Summarization: A Hybrid Optimization with Ensemble Learning Approach
2025, cites this paper
The Origin of Self-Attention: Pairwise Affinity Matrices in Feature Selection and the Emergence of Self-Attention
2025, cites this paper
Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL
2025, cites this paper
Current Approaches in Abstractive Text Summarization: A Comprehensive Survey and Analysis
2025, cites this paper
DiscoSum: Discourse-aware News Summarization
2025, cites this paper
A Reinforcement Learning-Based Generative Approach for Event Temporal Relation Extraction
2025, cites this paper
RIVAL: Reinforcement Learning with Iterative and Adversarial Optimization for Machine Translation
2025, cites this paper
Text Summarization with LLM: A Comparison of Transformer and Non-Transformer Models
2025, cites this paper
Effective summarization of ChatGPT user feedback: integrating topic detection with Markov chains
2025, cites this paper
Reward Models are Metrics in a Trench Coat
2025, cites this paper
Intent-aware personalized summarization for educational texts with large language models
2025, cites this paper
RLHFSpec: Breaking the Efficiency Bottleneck in RLHF Training via Adaptive Drafting
2025, cites this paper
Temporal and Causal-Aware Summarization of Legal Public Opinion With Large Language Models
2025, cites this paper
DEG-Sum: Discourse-aware Event Graph Summarization for News Texts
2025, cites this paper
Abstractive Text Summarization: A Systematic Review of Techniques, Evaluation, and Future Directions
2025, cites this paper
Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization
2025, cites this paper
Sem-Rouge: Graph-Based Embedding for Automated Text Summarization with Using Large Language Models
2025, cites this paper
RA-EPNet: A novel network fusing residual axial attention and edge prediction for medical image segmentation
2025, cites this paper
LMR-IPGN: An Effective Model for automatic summarization of Chinese text
2025, cites this paper
GKG-LLM: A Unified Framework for Generalized Knowledge Graph Construction
2025, cites this paper
Warmup Generations: A Task-Agnostic Approach for Guiding Sequence-to-Sequence Learning with Unsupervised Initial State Generation
2025, cites this paper
When Evolution Strategy Meets Language Models Tuning
2025, cites this paper
Design and Research of Intelligent Chatbot for Campus Information Consultation Assistant
2025, cites this paper
Rethinking Natural Language Generation with Layer-Wise Multi-View Decoding
2025, cites this paper
Sentence Embeddings as an intermediate target in end-to-end summarisation
2025, cites this paper
Alleviating Chinese repetitive generation via intra and intersentence penalty
2025, cites this paper
APG: Automatic Prompt Generation for Improved Document Summarization
2025, cites this paper
Fine-Tuning LLaMA 3.2-1B for Long-Text Summarization: A Case Study on Book Summarization
2025, cites this paper
Text Summarization Using PEGASUS Transformer Model in Machine Learning
2025, cites this paper
TalkLess: Blending Extractive and Abstractive Summarization for Editing Speech to Preserve Content and Style
2025, cites this paper
RETO: Reinforcement learning enhanced terminology optimization for cyber threat intelligence summarization
2025, cites this paper
Exploring Visual Information Enhancement for Multimodal Customized Opinion Generation
2025, cites this paper
A survey of automatic text summarization: concepts, advances and future prospects
2025, cites this paper
Gated Graph Neural Networks with Attention for Abstractive Summarization of Scientific Documents
2025, cites this paper
A review of AI-based business lead generation: Scrapus as a case study
2025, cites this paper
Deep Learning Methods for Text Summarization
2025, cites this paper
Analysis of the Effectiveness of Iterative Prompts in the Integration of Classification and Summarization of User Reports Based on NLP
2025, cites this paper
Abstractive summarization using multilingual text-to-text transfer transformer for the Turkish text
2025, cites this paper
Past-Future Scheduler for LLM Serving under SLA Guarantees
2025, cites this paper
CoMuMDR: Code-mixed Multi-modal Multi-domain corpus for Discourse paRsing in conversations
2025, cites this paper
Digitalisasi Pengelolaan Dana Desa di Desa Bina Baru Kabupaten Sidenreng Rappang [Digitalization of Village Fund Management in Bina Baru Village, Sidenreng Rappang Regency]
2025, cites this paper
Abstractive summarization through the prism of decoding strategies
2025, cites this paper
iFuzzyTL: Interpretable Fuzzy Transfer Learning for Steady-State Visual Evoked Potentials Brain–Computer Interfaces System
2025, cites this paper
Document Summarization with Conformal Importance Guarantees
2025, cites this paper
Smart Trial: Evaluating the Use of Large Language Models for Recruiting Clinical Trial Participants via Social Media
2025, cites this paper
TracSum: A New Benchmark for Aspect-Based Summarization with Sentence-Level Traceability in Medical Domain
2025, cites this paper
Enhancing Abstractive Summarization with T5SAM: A Transformer-Based Approach
2025, cites this paper
A Novel Approach to Web Article Summarization
2025, cites this paper
PEGASUS-XL with saliency-guided scoring and long-input encoding for multi-document abstractive summarization
2025, cites this paper
HyFit: Hybrid Fine-Tuning With Diverse Sampling for Abstractive Summarization
2025, cites this paper
Introducing bidirectional attention for autoregressive models in abstractive summarization
2025, cites this paper
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
2025, cites this paper
Abstractive Summarization of Historical Documents: A New Dataset and Novel Method Using a Domain-Specific Pretrained Model
2025, cites this paper
A novel hybrid architecture for video frame prediction: combining convolutional LSTM and 3D CNN
2025, cites this paper
Ontology-based prompt tuning for news article summarization
2025, cites this paper
Fine-tuning text-to-SQL models with reinforcement-learning training objectives
2025, cites this paper
Position: General Intelligence Requires Reward-based Pretraining
2025, cites this paper
Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation
2025, cites this paper
Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference
2025, cites this paper
Abstractive text summarization using deep learning models: a survey
2025, cites this paper
Study on the standardization method of radiotelephony communication in low-altitude airspace based on BART
2025, cites this paper
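Penerapan E-Government dalam Digitalisasi Pelayanan Publik di Kantor Desa Sipodeceng [Implementation of E-Government in the Digitalization of Public Services at the Sipodeceng Village Office]
2025, cites this paper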
NLP with Deep Learning Approaches in Text Generation
2025, cites this paper
PLSRP: prompt learning for send–receive path prediction
2024, cites this paper
Model-based Preference Optimization in Abstractive Summarization without Human Feedback
2024, influential citation
A Fuzzy Logic-Based Approach to Predict Human Interaction by Functional Near-Infrared Spectroscopy
2024, cites this paper
Autoregressive Multi-trait Essay Scoring via Reinforcement Learning with Scoring-aware Multiple Rewards
2024, cites this paper
SyncIntellects: Orchestrating LLM Inference with Progressive Prediction and QoS-Friendly Control
2024, cites this paper
Mitigating the negative impact of over-association for conversational query production
2024, cites this paper
Generating Attractive Ad Text by Facilitating the Reuse of Landing Page Expressions
2024, influential citation
CLIP-based Semantic Enhancement and Vocabulary Expansion for Video Captioning Using Reinforcement Learning
2024, cites this paper
Detection of Quora Question Pair with Same Intent
2024, cites this paper
Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models
2024, cites this paper
VATMAN: Integrating Video-Audio-Text for Multimodal Abstractive SummarizatioN via Crossmodal Multi-Head Attention Fusion
2024, cites this paper
Text summarization for pharmaceutical sciences using hierarchical clustering with a weighted evaluation methodology
2024, cites this paper
Language Models Learn to Mislead Humans via RLHF
2024, cites this paper
Heterogeneous graphormer for extractive multimodal summarization
2024, cites this paper
Fact-Aware Generative Text Summarization with Dependency Graphs
2024, cites this paper
Chinese Text Summarization Based on Multi-Layer Attention
2024, influential citation
REInstruct: Building Instruction Data from Unlabeled Corpus
2024, influential citation
Recent Advances in Multi-Choice Machine Reading Comprehension: A Survey on Methods and Datasets
2024, cites this paper
Automatic Pull Request Description Generation Using LLMs: A T5 Model Approach
2024, cites this paper
Leveraging Automated POS Tagging to Decode Parent-Infant Interactions in Digital Gameplays
2024, cites this paper
DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation
2024, cites this paper
Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation
2024, cites this paper
The motion forecasting study of floating offshore wind turbine using self-attention long short-term memory method
2024, cites this paper
Comprehensive Survey of Abstractive Text Summarization Techniques
2024, cites this paper
Abstractive text summarization: State of the art, challenges, and improvements
2024, cites this paper
Deep-Learning-Based Pre-Training and Refined Tuning for Web Summarization Software
2024, cites this paper