TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale
Pengcheng Jiang, Cao Xiao, Zifeng Wang, Parminder Bhatia, Jimeng Sun, Jiawei Han
Published 2024 in North American Chapter of the Association for Computational Linguistics (NAACL)

ABSTRACT
The advent of large language models (LLMs) has significantly advanced natural language processing tasks like text summarization. However, their large size and computational demands, coupled with privacy concerns around data transmission, limit their use in resource-constrained and privacy-centric settings. To overcome this, we introduce TriSum, a framework for distilling LLMs' text summarization abilities into a compact, local model. First, an LLM extracts a set of aspect-triple rationales and summaries, which are refined with a dual-scoring method for quality. Next, a smaller local model is trained on these rationales and summaries, following a curriculum learning strategy that progresses from simple to complex tasks. Our method enhances local model performance on various benchmarks (CNN/DailyMail, XSum, and ClinicalTrial), outperforming baselines by 4.5%, 8.5%, and 7.4%, respectively. It also improves interpretability by providing insights into the summarization rationale.
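The abstract outlines a three-step pipeline: an LLM drafts aspect-triple rationales and summaries, a dual-scoring method filters them for quality, and a compact local model is trained on the survivors with a simple-to-complex curriculum. The sketch below illustrates that flow under stated assumptions; the scoring proxies, the curriculum task names, and the train_step interface are hypothetical stand-ins, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Rationale:
    triples: list[str]  # aspect triples, e.g. "(aspirin, reduces, fever)"
    summary: str        # LLM-drafted summary grounded in those triples

def dual_score(r: Rationale, document: str) -> float:
    """Stand-in dual score: average of (a) how much of the summary is
    supported by the source document and (b) how well the summary is
    grounded in the extracted triples. Both sub-scores are illustrative
    proxies, not the paper's metrics."""
    tokens = r.summary.split()
    coverage = sum(tok in document for tok in tokens) / max(len(tokens), 1)
    grounded = sum(
        any(part in r.summary for part in t.strip("()").split(", "))
        for t in r.triples
    ) / max(len(r.triples), 1)
    return 0.5 * coverage + 0.5 * grounded

def select_rationales(candidates, document, keep=1):
    """Keep only the highest-scoring LLM-generated rationale(s)."""
    return sorted(candidates, key=lambda r: dual_score(r, document), reverse=True)[:keep]

# Simple-to-complex curriculum for the local model (task names are assumptions).
CURRICULUM = ["triple_extraction", "rationale_generation", "rationale_and_summary"]

def train_with_curriculum(model, stage_data):
    """Train on easier tasks first, then progressively harder ones."""
    for stage in CURRICULUM:
        for source, target in stage_data.get(stage, []):
            model.train_step(task=stage, source=source, target=target)

class DummyModel:
    """Placeholder for the compact local summarizer."""
    def train_step(self, task, source, target):
        print(f"[{task}] {source[:40]!r} -> {target[:40]!r}")

if __name__ == "__main__":
    doc = "Aspirin reduces fever and relieves mild pain in adults."
    candidates = [
        Rationale(["(aspirin, reduces, fever)"], "Aspirin reduces fever."),
        Rationale(["(aspirin, causes, rain)"], "Aspirin causes rain."),
    ]
    best = select_rationales(candidates, doc)[0]  # better-grounded candidate wins
    train_with_curriculum(DummyModel(), {
        "triple_extraction": [(doc, "; ".join(best.triples))],
        "rationale_and_summary": [(doc, best.summary)],
    })
```

The key design point the sketch preserves is the ordering: the local model sees the easy structured task (triple extraction) before the full rationale-plus-summary objective, so the structured rationale scaffolds the harder generation task.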
PUBLICATION RECORD
- Publication year: 2024
- Venue: North American Chapter of the Association for Computational Linguistics
- Publication date: 2024-03-15
- Fields of study: Computer Science
- Source metadata: Semantic Scholar