TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale
Pengcheng Jiang, Cao Xiao, Zifeng Wang, Parminder Bhatia, Jimeng Sun, Jiawei Han
Published 2024 in North American Chapter of the Association for Computational Linguistics (NAACL)

ABSTRACT
The advent of large language models (LLMs) has significantly advanced natural language processing tasks like text summarization. However, their large size and computational demands, coupled with privacy concerns around data transmission, limit their use in resource-constrained and privacy-centric settings. To overcome this, we introduce TriSum, a framework for distilling LLMs' text summarization abilities into a compact, local model. First, an LLM extracts a set of aspect-triple rationales and summaries, which are refined with a dual-scoring method for quality. Next, a smaller local model is trained on these rationales and summaries, following a curriculum learning strategy that progresses from simple to complex tasks. Our method enhances local model performance on various benchmarks (CNN/DailyMail, XSum, and ClinicalTrial), outperforming baselines by 4.5%, 8.5%, and 7.4%, respectively. It also improves interpretability by providing insights into the summarization rationale.
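The abstract outlines a three-step pipeline: an LLM drafts aspect-triple rationales and summaries, a dual-scoring method filters them for quality, and a compact local model is trained on the survivors with a simple-to-complex curriculum. The sketch below illustrates that flow under stated assumptions; the scoring proxies, the curriculum task names, and the train_step interface are hypothetical stand-ins, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Rationale:
    triples: list[str]  # aspect triples, e.g. "(aspirin, reduces, fever)"
    summary: str        # LLM-drafted summary grounded in those triples

def dual_score(r: Rationale, document: str) -> float:
    """Stand-in dual score: average of (a) how much of the summary is
    supported by the source document and (b) how well the summary is
    grounded in the extracted triples. Both sub-scores are illustrative
    proxies, not the paper's metrics."""
    tokens = r.summary.split()
    coverage = sum(tok in document for tok in tokens) / max(len(tokens), 1)
    grounded = sum(
        any(part in r.summary for part in t.strip("()").split(", "))
        for t in r.triples
    ) / max(len(r.triples), 1)
    return 0.5 * coverage + 0.5 * grounded

def select_rationales(candidates, document, keep=1):
    """Keep only the highest-scoring LLM-generated rationale(s)."""
    return sorted(candidates, key=lambda r: dual_score(r, document), reverse=True)[:keep]

# Simple-to-complex curriculum for the local model (task names are assumptions).
CURRICULUM = ["triple_extraction", "rationale_generation", "rationale_and_summary"]

def train_with_curriculum(model, stage_data):
    """Train on easier tasks first, then progressively harder ones."""
    for stage in CURRICULUM:
        for source, target in stage_data.get(stage, []):
            model.train_step(task=stage, source=source, target=target)

class DummyModel:
    """Placeholder for the compact local summarizer."""
    def train_step(self, task, source, target):
        print(f"[{task}] {source[:40]!r} -> {target[:40]!r}")

if __name__ == "__main__":
    doc = "Aspirin reduces fever and relieves mild pain in adults."
    candidates = [
        Rationale(["(aspirin, reduces, fever)"], "Aspirin reduces fever."),
        Rationale(["(aspirin, causes, rain)"], "Aspirin causes rain."),
    ]
    best = select_rationales(candidates, doc)[0]  # better-grounded candidate wins
    train_with_curriculum(DummyModel(), {
        "triple_extraction": [(doc, "; ".join(best.triples))],
        "rationale_and_summary": [(doc, best.summary)],
    })
```

The key design point the sketch preserves is the ordering: the local model sees the easy structured task (triple extraction) before the full rationale-plus-summary objective, so the structured rationale scaffolds the harder generation task.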
PUBLICATION RECORD
- Publication year: 2024
- Venue: North American Chapter of the Association for Computational Linguistics
- Publication date: 2024-03-15
- Fields of study: Computer Science
- Source metadata: Semantic Scholar