Exploring Content Models for Multi-Document Summarization

Published 2009 in North American Chapter of the Association for Computational Linguistics

ABSTRACT

We present an exploration of generative probabilistic models for multi-document summarization. Beginning with a simple word frequency based model (Nenkova and Vanderwende, 2005), we construct a sequence of models each injecting more structure into the representation of document set content and exhibiting ROUGE gains along the way. Our final model, HierSum, utilizes a hierarchical LDA-style model (Blei et al., 2004) to represent content specificity as a hierarchy of topic vocabulary distributions. At the task of producing generic DUC-style summaries, HierSum yields state-of-the-art ROUGE performance and in pairwise user evaluation strongly outperforms Toutanova et al. (2007)'s state-of-the-art discriminative system. We also explore HierSum's capacity to produce multiple 'topical summaries' in order to facilitate content discovery and navigation.
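
The word-frequency starting point named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it is a simplified greedy selector in the spirit of frequency-based extraction, assuming a crude regex tokenizer and no stop-word handling. The names `tokenize`, `frequency_summary`, and `max_sentences` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase alphabetic tokens are good enough for a sketch.
    return re.findall(r"[a-z]+", sentence.lower())


def frequency_summary(sentences, max_sentences=2):
    """Greedily pick sentences whose words are frequent in the document set."""
    tokens = [tokenize(s) for s in sentences]
    counts = Counter(w for toks in tokens for w in toks)
    total = sum(counts.values())
    # Unigram probability of each word over the whole document set.
    prob = {w: c / total for w, c in counts.items()}

    chosen = []
    while len(chosen) < max_sentences:
        best, best_score = None, -1.0
        for i, toks in enumerate(tokens):
            if i in chosen or not toks:
                continue
            # Score a sentence by the average probability of its words.
            score = sum(prob[w] for w in toks) / len(toks)
            if score > best_score:
                best, best_score = i, score
        if best is None:
            break  # no unselected, non-empty sentences remain
        chosen.append(best)
        # Square the probability of covered words to discourage redundancy.
        for w in set(tokens[best]):
            prob[w] = prob[w] ** 2
    return [sentences[i] for i in chosen]
```

Down-weighting words that are already covered is what keeps a purely frequency-driven selector from repeating the same high-frequency content; the structured models the abstract goes on to describe replace this single vocabulary distribution with richer ones.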

PUBLICATION RECORD

Publication year: 2009
Venue: North American Chapter of the Association for Computational Linguistics
Publication date: 2009-05-31
Fields of study: Computer Science
Identifiers: DOI 10.3115/1620754.1620807
Source metadata: Semantic Scholar

REFERENCES

Latent Dirichlet Allocation
2009 influential reference
Bayesian Unsupervised Topic Segmentation
2008 cited by this paper
Topic-Driven Multi-Document Summarization with Encyclopedic Knowledge and Spreading Activation
2008 cited by this paper
Topical N-Grams: Phrase and Topic Discovery, with an Application to Information Retrieval
2007 cited by this paper
Generating a Table-of-Contents
2007 cited by this paper
A Study of Global Inference Algorithms in Multi-document Summarization
2007 cited by this paper
The PYTHY Summarization System: Microsoft Research at DUC 2007
2007 cited by this paper
Beyond SumBasic: Task-focused summarization with sentence simplification and lexical expansion
2007 influential reference
Satisfying information needs with multi-document summaries
2007 cited by this paper
Improved Affinity Graph Based Multi-Document Summarization
2006 cited by this paper
Bayesian Query-Focused Summarization
2006 cited by this paper
An Information-Theoretic Approach to Automatic Evaluation of Summaries
2006 cited by this paper
Impact of Linguistic Analysis on the Semantic Graph Coverage and Learning of Document Extracts
2005 cited by this paper
The Impact of Frequency on Summarization
2005 cited by this paper
LexRank: Graph-based Centrality as Salience in Text Summarization
2004 cited by this paper
LexRank: Graph-based Lexical Centrality as Salience in Text Summarization
2004 cited by this paper
Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization
2004 cited by this paper
ROUGE: A Package for Automatic Evaluation of Summaries
2004 influential reference
iNeATS: Interactive Multi-Document Summarization
2003 cited by this paper
Hierarchical Topic Models and the Nested Chinese Restaurant Process
2003 cited by this paper
Probabilistic Text Structuring: Experiments with Sentence Ordering
2003 cited by this paper
Towards Multidocument Summarization by Reformulation: Progress and Prospects
1999 cited by this paper
The Automatic Creation of Literature Abstracts
1958 cited by this paper

CITED BY

Small language models applied in text summarization task of health-related news to improve public health audit: an experimental case study.
2026 cites this paper
Legal text summarization with optimized hybrid models and fine-tuned LLaMA-2
2026 cites this paper
A Comparative Study of Extractive Summarization Techniques
2025 cites this paper
Optimizing Multi-Document Summarization via Discrete Bat Algorithm: A Nature-Inspired Approach for Enhanced Text
2025 cites this paper
Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization
2025 cites this paper
BOE-XSUM: Extreme Summarization in Clear Language of Spanish Legal Decrees and Notifications
2025 cites this paper
Effective Neural Author-Topic Modeling by Leveraging Pre-Trained Language Models
2025 cites this paper
A Novel Approach for Graph-based Extractive Text Summarization using Karcı Dominant Set Algorithm and Eigenvector Centrality
2025 cites this paper
Enhancing Text Summarization with AI: A Multi-Agent System and Human Comparison in Educational Contexts
2025 cites this paper
Extractive summarization of Indian legal judgements using unsupervised role labelling
2025 cites this paper
Comparative Evaluation of Extractive Summarization Algorithms on Multi-Topic News Articles in Portuguese
2025 cites this paper
Leveraging large language models to enhance clustering-based topic modeling
2025 cites this paper
A comprehensive survey on automatic text summarization with exploration of LLM-based methods
2025 cites this paper
From Extraction to Reasoning: A Systematic Review of Algorithms in Multi-Document Summarization and QA
2025 cites this paper
Leveraging large language models for abstractive summarization of Italian legal news
2025 cites this paper
Streamlining Video Summarization with NLP: Techniques, Implementation, and Future Direction
2025 cites this paper
GenSumm: A Joint Framework for Multi-Task Tweet Classification and Summarization Using Sentiment Analysis and Generative Modelling
2024 cites this paper
An integer linear programming model for multi document summarization of learning materials using phrase embedding technique
2024 cites this paper
A Mixed-Language Multi-Document News Summarization Dataset and a Graphs-Based Extract-Generate Model
2024 influential citation
Stacked Denoising Variational Auto Encoder Model for Extractive Web Text Summarization
2024 influential citation
Deeper Investigation on Extractive-Abstractive Summarization for Indonesian Text
2024 cites this paper
A Comprehensive Survey on Process-Oriented Automatic Text Summarization with Exploration of LLM-Based Methods
2024 cites this paper
Topic Modeling: A Consistent Framework for Comparative Studies
2024 cites this paper
MEQA: A Benchmark for Multi-hop Event-centric Question Answering with Explanations
2024 cites this paper
BioMDSum: An Effective Hybrid Biomedical Multi-Document Summarization Method Based on PageRank and Longformer Encoder-Decoder
2024 cites this paper
TOMDS (Topic-Oriented Multi-Document Summarization): Enabling Personalized Customization of Multi-Document Summaries
2024 cites this paper
COVID-19 Research Trends in Social Work: LDA Topic Modeling Analysis in South Korea
2024 cites this paper
Comparison of Extractive and Abstractive Approaches in Automatic Text Summarization: An Evaluation on BBC-News and PubMed Datasets
2024 cites this paper
HintMiner: Automatic Question Hints Mining From Q&A Web Posts with Language Model via Self-Supervised Learning
2024 cites this paper
An indicator-based multi-objective variable neighborhood search approach for query-focused summarization
2024 cites this paper
Extracting semantic link network of words from text for semantics-based applications
2024 cites this paper
Comprehensive Review of Automatic Text Summarization Techniques
2024 influential citation
An experimental study of game theory with various word embeddings for automatic extractive text summarization
2024 cites this paper
Multi-Document Summarization Using Selective Attention Span and Reinforcement Learning
2023 cites this paper
Can Anaphora Resolution Improve Extractive Query-Focused Multi-Document Summarization?
2023 cites this paper
An Extraction Aggregation Strategy Using a Hierarchical Representation Model Based on Longformers and Transform
2023 cites this paper
Unsupervised Multi-document Summarization with Holistic Inference
2023 cites this paper
PRODIGy: a PROfile-based DIalogue Generation dataset
2023 cites this paper
A novel multi document summarization with document-elements augmentation for learning materials using concept based ILP and clustering methods
2023 cites this paper
Summarization of Lengthy Legal Documents via Abstractive Dataset Building: An Extract-then-Assign Approach
2023 cites this paper
Adapt-to-Learn Policy Network for Abstractive Multi-document Summarization
2023 cites this paper
APCS: Towards Argument Based Pros and Cons Summarization of Peer Reviews
2023 cites this paper
Abstractive Text Summarization Based on Long-Short Transformer
2023 cites this paper
Reordered Exponential Golomb Error Correction Code for Universal Near-Capacity Joint Source-Channel Coding
2023 cites this paper
A Personalized Reinforcement Learning Summarization Service for Learning Structure from Unstructured Data
2023 cites this paper
Natural Language Processing Applied in the Context of Economic Defense: A Case Study in a Brazilian Federal Public Administration Agency
2023 cites this paper
MyOracle: A Question-Answering application to improve access to information for low-literacy users
2023 cites this paper
Sequence labeling with MLTA: Multi-level topic-aware mechanism
2023 cites this paper
Bridging Natural Language Processing and Psycholinguistics: computationally grounded semantic similarity datasets for Basque and Spanish
2023 cites this paper
DeepMetaGen: an unsupervised deep neural approach to generate template-based meta-reviews leveraging on aspect category and sentiment analysis from peer reviews
2023 cites this paper
Graph-enhanced multi-answer summarization under question-driven guidance
2023 cites this paper
Bridging Natural Language Processing and Psycholinguistics: computationally grounded semantic similarity and relatedness datasets for Basque and Spanish
2023 cites this paper
A sentence is known by the company it keeps: Improving Legal Document Summarization Using Deep Clustering
2023 cites this paper
A Comprehensive Review on Automatic Text Summarization
2023 cites this paper
A comprehensive review of automatic text summarization techniques: method, data, evaluation and coding
2023 influential citation
Bayesian Optimization based Score Fusion of Linguistic Approaches for Improving Legal Document Summarization
2023 cites this paper
CovSumm: an unsupervised transformer-cum-graph-based hybrid document summarization model for CORD-19
2023 cites this paper
OASum: Large-Scale Open Domain Aspect-based Summarization
2022 cites this paper
AI-based learning content generation and learning pathway augmentation to increase learner engagement
2022 cites this paper
Controlled Text Reduction
2022 cites this paper
GAE-ISUMM: Unsupervised Graph-based Summarization for Indian Languages
2022 cites this paper
Interactive Query-Assisted Summarization via Deep Reinforcement Learning
2022 cites this paper
Tiered sentence based topic model for multi-document summarization
2022 cites this paper
The Right to Remain Plain: Summarization and Simplification of Legal Documents
2022 cites this paper
Engineering Document Summarization: A Bidirectional Language Model-Based Approach
2022 cites this paper
A multi-objective memetic algorithm for query-oriented text summarization: Medicine texts as a case study
2022 cites this paper
Generating extractive sentiment summaries for natural language user queries on products
2022 cites this paper
Template-Based Headline Generator for Multiple Documents
2022 cites this paper
Revealing the Reflections of the Pandemic by Investigating COVID-19 Related News Articles Using Machine Learning and Network Analysis
2022 cites this paper
Topic modeling algorithms and applications: A survey
2022 cites this paper
Automatic Related Work Generation: A Meta Study
2022 influential citation
An Unsupervised Masking Objective for Abstractive Multi-Document News Summarization
2022 cites this paper
Web Page Classification Based on an Accurate Technique for Key Data Extraction
2022 cites this paper
Unsupervised Query-Focused Multi-document Summarization Using uSIF Sentence Embedding Model and Maximal Marginal Relevance Criterion
2022 cites this paper
A Hybrid Approach to Cross-lingual Product Review Summarization
2022 cites this paper
Improving Kullback-Leibler based legal document summarization using enhanced text representation
2022 cites this paper
Privacy Pitfalls of Online Service Terms and Conditions: a Hybrid Approach for Classification and Summarization
2022 cites this paper
Implementation of Data Analysis and Document Summarization in Social Media Data Using R and Python
2022 cites this paper
Comparing Methods for Extractive Summarization of Call Centre Dialogue
2022 cites this paper
Transfer Learning for Sequence Generation: from Single-source to Multi-source
2021 cites this paper
Extending Multi-Document Summarization Evaluation to the Interactive Setting
2021 cites this paper
Improving Reader Motivation with Machine Learning
2021 cites this paper
Unsupervised extractive multi-document summarization method based on transfer learning from BERT multi-task fine-tuning
2021 cites this paper
Improving Online Forums Summarization via Hierarchical Unified Deep Neural Network
2021 cites this paper
See, Hear, Read: Leveraging Multimodality with Guided Attention for Abstractive Text Summarization
2021 cites this paper
An investigation of linguistic problems in automatic multi-document summaries / Uma investigação de problemas linguísticos em sumários automáticos multidocumento
2021 cites this paper
Generating Related Work
2021 cites this paper
A Sliding-Window Approach to Automatic Creation of Meeting Minutes
2021 cites this paper
Automatic Summarization of Legal Bills: A Comparative Analysis of Classical Extractive Approaches
2021 cites this paper
CAWESumm: A Contextual and Anonymous Walk Embedding Based Extractive Summarization of Legal Bills
2021 cites this paper
Wikipedia Current Events Summarization using Particle Swarm Optimization
2021 cites this paper
Error Analysis of using BART for Multi-Document Summarization: A Study for English and German Language
2021 cites this paper
Improving Online Forums Summarization via Unifying Hierarchical Attention Networks with Convolutional Neural Networks
2021 cites this paper
Unsupervised query-focused multi-document summarization based on transfer learning from sentence embedding models, BM25 model, and maximal marginal relevance criterion
2021 cites this paper
On Semi-Automatic Creation of Dataset for Multi-Document Automatic Summarization of News Articles an Forum Threads
2021 cites this paper
Abstractive Multi-document Summarization using Topical Simplicial Curves
2021 cites this paper
Live blog summarization
2021 cites this paper
Unsupervised Abstractive Summarization of Bengali Text Documents
2021 cites this paper
DP-BERT: Dynamic Programming BERT for Text Summarization
2021 cites this paper
An online multi-source summarization algorithm for text readability in topic-based search
2021 cites this paper