Generative Content Models for Structural Analysis of Medical Abstracts

Jimmy J. Lin, Damianos G. Karakos, Dina Demner-Fushman, S. Khudanpur

Published 2006 in BioNLP@NAACL-HLT

ABSTRACT

The ability to accurately model the content structure of text is important for many natural language processing applications. This paper describes experiments with generative models for analyzing the discourse structure of medical abstracts, which generally follow the pattern of "introduction", "methods", "results", and "conclusions". We demonstrate that Hidden Markov Models are capable of accurately capturing the structure of such texts, and can achieve classification accuracy comparable to that of discriminative techniques. In addition, generative approaches provide advantages that may make them preferable to discriminative techniques such as Support Vector Machines under certain conditions. Our work makes two contributions: at the application level, we report good performance on an interesting task in an important domain; more generally, our results contribute to an ongoing discussion regarding the tradeoffs between generative and discriminative techniques.
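As a rough illustration of the kind of generative model the abstract describes, the sketch below treats the four section labels as hidden states of a Hidden Markov Model and each sentence as an observation, then recovers the most likely label sequence with Viterbi decoding. The abstract does not give model details, so everything specific here is an assumption made for the example: the toy training sentences, the unigram emission model, the add-one smoothing, and the helper names (`train`, `viterbi`). It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict

STATES = ["introduction", "methods", "results", "conclusions"]

# Hypothetical toy data, invented for illustration only.
TRAIN = [
    [("introduction", "we study the effect of aspirin on stroke risk"),
     ("methods", "patients were randomly assigned to aspirin or placebo"),
     ("results", "stroke occurred in fewer patients in the aspirin group"),
     ("conclusions", "aspirin may reduce stroke risk in these patients")],
    [("introduction", "the effect of exercise on blood pressure is unclear"),
     ("methods", "adults were randomly assigned to exercise or usual care"),
     ("methods", "blood pressure was measured at twelve weeks"),
     ("results", "blood pressure was lower in the exercise group"),
     ("conclusions", "exercise may reduce blood pressure in adults")],
]

def train(abstracts, alpha=1.0):
    """Collect start, transition, and per-state word counts from labeled abstracts."""
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    vocab = set()
    for abstract in abstracts:
        prev = None
        for label, sentence in abstract:
            words = sentence.lower().split()
            emit[label].update(words)
            vocab.update(words)
            if prev is None:
                start[label] += 1
            else:
                trans[prev][label] += 1
            prev = label
    return {"start": start, "trans": trans, "emit": emit,
            "n_words": len(vocab) + 1,  # one extra slot for unseen words
            "alpha": alpha}

def _smoothed(counter, key, n_outcomes, alpha):
    """Additively smoothed log-probability of `key` under `counter`."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))

def log_start(model, state):
    return _smoothed(model["start"], state, len(STATES), model["alpha"])

def log_trans(model, prev, state):
    return _smoothed(model["trans"][prev], state, len(STATES), model["alpha"])

def log_emit(model, state, sentence):
    # Unigram assumption: words are independent given the section label.
    return sum(_smoothed(model["emit"][state], w, model["n_words"], model["alpha"])
               for w in sentence.lower().split())

def viterbi(model, sentences):
    """Return the most likely section label for each sentence."""
    if not sentences:
        return []
    scores = [{s: log_start(model, s) + log_emit(model, s, sentences[0])
               for s in STATES}]
    backpointers = []
    for sentence in sentences[1:]:
        row, ptr = {}, {}
        for s in STATES:
            best_prev = max(STATES,
                            key=lambda p: scores[-1][p] + log_trans(model, p, s))
            row[s] = (scores[-1][best_prev] + log_trans(model, best_prev, s)
                      + log_emit(model, s, sentence))
            ptr[s] = best_prev
        scores.append(row)
        backpointers.append(ptr)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: scores[-1][s])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return path[::-1]

model = train(TRAIN)
labels = viterbi(model, [
    "patients were randomly assigned to drug or placebo",
    "blood pressure was lower in the drug group",
])
print(labels)
```

Because the transition counts come from abstracts that follow the introduction, methods, results, conclusions order, the decoder favors label sequences in that order even when an individual sentence is ambiguous. This sequential prior is what distinguishes an HMM from classifying each sentence independently.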

PUBLICATION RECORD

Publication year: 2006
Venue: BioNLP@NAACL-HLT
Publication date: 2006-06-08
Fields of study: Medicine, Computer Science
Identifiers: DOI 10.3115/1567619.1567631
Source metadata: Semantic Scholar

REFERENCES

Using argumentation to retrieve articles with similar citations: An inquiry into improving related articles search in the MEDLINE digital library
2006, cited by this paper
Zone analysis in biology articles as a basis for information extraction
2006, cited by this paper
Semi-Automatic Indexing of Full Text Biomedical Articles
2005, cited by this paper
Knowledge Extraction for Clinical Question Answering: Preliminary Results
2005, cited by this paper
The Role of Title, Metadata and Abstract in Identifying Clinically Relevant Journal Articles
2005, cited by this paper
Development of the 2003 CU-HTK conversational telephone speech transcription system
2004, cited by this paper
Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization
2004, influential reference
Report on the TREC 2004 Experiment: Genomics Track
2004, cited by this paper
Leveraging a common representation for personalized search and summarization in a medical digital library
2003, cited by this paper
Report on the TREC 2003 Experiment: Genomic Track
2003, cited by this paper
Categorization of Sentence Types in Medical Abstracts
2003, cited by this paper
The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text
2003, cited by this paper
An Unsupervised Approach to Recognizing Discourse Relations
2002, cited by this paper
Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program
2001, cited by this paper
On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes
2001, cited by this paper
Patterns in Scientific Abstracts
2001, cited by this paper
What’s Yours and What’s Mine: Determining Intellectual Attribution in Scientific Text
2000, cited by this paper
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
1999, cited by this paper
Modern Applied Statistics with S-Plus
1996, cited by this paper
The HTK book
1995, cited by this paper
Can primary care physicians' questions be answered using the medical journal literature?
1994, cited by this paper
The Unified Medical Language System
1993, cited by this paper
Genre Analysis: English in Academic and Research Settings
1990, cited by this paper
A proposal for more informative abstracts of clinical articles. Ad Hoc Working Group for Critical Appraisal of the Medical Literature.
1987, cited by this paper
Information needs in office practice: are they being met?
1985, cited by this paper
Text generation: using discourse strategies and focus constraints to generate natural language text
1985, cited by this paper

CITED BY

Coupling Local Context and Global Semantic Prototypes via a Hierarchical Architecture for Rhetorical Roles Labeling
2026, cites this paper
Enhancing abstractive summarization of scientific papers using structure information
2024, cites this paper
Boundary-Aware Dual Biaffine Model for Sequential Sentence Classification in Biomedical Documents
2024, cites this paper
LSTM-based Deep Neural Network With A Focus on Sentence Representation for Sequential Sentence Classification in Medical Scientific Abstracts
2024, cites this paper
Multi-label Sequential Sentence Classification via Large Language Model
2024, cites this paper
Towards precise PICO extraction from abstracts of randomized controlled trials using a section-specific learning approach
2023, cites this paper
Sectioning biomedical abstracts using pointer networks
2023, cites this paper
A Taxonomy of Academic Abstract Sentence Classification Modelling
2022, cites this paper
Sectioning of Biomedical Abstracts: A Sequence of Sequence Classification Task
2022, cites this paper
A review on method framework construction of Chinese Information Science
2022, cites this paper
A Span-based Dynamic Local Attention Model for Sequential Sentence Classification
2021, cites this paper
Lifetime Achievement Award
2021, cites this paper
Research Method Classification with Deep Transfer Learning for Semi-Automatic Meta-Analysis of Information Systems Papers
2021, cites this paper
A hybrid approach to recognize generic sections in scholarly documents
2021, cites this paper
Kathy McKeown Interviews Bonnie Webber
2021, cites this paper
CODA-19: Using a Non-Expert Crowd to Annotate Research Aspects on 10,000+ Abstracts in the COVID-19 Open Research Dataset
2020, cites this paper
Sequential Span Classification with Neural Semi-Markov CRFs for Biomedical Abstracts
2020, cites this paper
Feature engineering vs. deep learning for paper section identification: Toward applications in Chinese medical literature
2020, cites this paper
Knowledge fusion through academic articles: a survey of definitions, techniques, applications and challenges
2020, cites this paper
A Hierarchical Model with Recurrent Convolutional Neural Networks for Sequential Sentence Classification
2019, cites this paper
Emerald 110k: A Multidisciplinary Dataset for Abstract Sentence Classification
2019, cites this paper
Fast and scalable neural embedding models for biomedical sentence classification
2018, cites this paper
Context-aware Argument Mining and Its Applications in Education
2018, cites this paper
Automatic Labeling of Problem-Solving Dialogues for Computational Microgenetic Learning Analytics
2018, cites this paper
Low Resource Methods for Medieval Document Sections Analysis
2018, cites this paper
Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts
2018, cites this paper
Information extraction from scientific articles: a survey
2018, cites this paper
Sequential short-text classification with neural networks
2017, cites this paper
Section-wise indexing and retrieval of research articles
2017, cites this paper
Unsupervised Trained Functional Discourse Parser for e-Learning Materials Scaffolding
2016, cites this paper
Advanced natural language processing and temporal mining for clinical discovery
2016, cites this paper
Neural Networks for Joint Sentence Classification in Medical Paper Abstracts
2016, cites this paper
Structuralizing biomedical abstracts with discriminative linguistic features
2016, cites this paper
Unsupervised discovery of information structure in biomedical documents
2015, cites this paper
Recognition and Evaluation of Clinical Section Headings in Clinical Documents Using Token-Based Formulation with Conditional Random Fields
2015, cites this paper
Constructing Synthesized Sheets by Mining Scientific Research Papers: Application to the Biological Domain
2015, cites this paper
Unsupervised Declarative Knowledge Induction for Constraint-Based Learning of Information Structure in Scientific Documents
2015, cites this paper
A Fuzzy Approach Model for Uncovering Hidden Latent Semantic Structure in Medical Text Collections
2015, cites this paper
Struggling to retain the functions of passive when translating English thesis abstracts
2015, cites this paper
Model dan Metoda Arsitektur pada Sistem Tanya Jawab Medis
2015, influential citation
FLATM: A fuzzy logic approach topic model for medical documents
2015, cites this paper
Towards a molecules production from DNA sequences based on clustering by 3D cellular automata approach and n-grams technique
2015, cites this paper
On the Discoursive Structure of Computer Graphics Research Papers
2015, cites this paper
Automatic identification of document sections for designing a French clinical corpus (Identification automatique de zones dans des documents pour la constitution d’un corpus médical en français) [in French]
2014, cites this paper
Scholar metadata and knowledge generation with human and artificial intelligence
2014, cites this paper
Citation sentence identification and classification for related work summarization
2014, cites this paper
A Multiclass-based Classification Strategy Sentence Categorization
2014, influential citation
An interactive metadata model for structural, descriptive, and referential representation of scholarly output
2014, influential citation
Exploiting domain knowledge for cross-domain text classification in heterogeneous data sources
2014, cites this paper
學術論文簡介的自動文步分析與寫作提示 (Automatic Move Analysis of Research Articles for Assisting Writing) [In Chinese]
2014, cites this paper
GeneRIF indexing: sentence selection based on machine learning
2013, cites this paper
Using the Argumentative Structure of Scientific Literature to Improve Information Access
2013, cites this paper
Chapter 16: Text Mining for Translational Bioinformatics
2013, cites this paper
Improved Information Structure Analysis of Scientific Documents Through Discourse and Lexical Constraints
2013, cites this paper
Discourse structure and language
2013, cites this paper
A Multiclass-based Classification Strategy for Rethorical Sentence Categorization from Scientific Papers
2013, influential citation
Mining methodologies from NLP publications: A case study in automatic terminology recognition
2012, cites this paper
Statistical Section Segmentation in Free-Text Clinical Records
2012, cites this paper
Discourse Structure and Computation: Past, Present and Future
2012, cites this paper
Automatic recognition of conceptualization zones in scientific articles and two life science applications
2012, cites this paper
Rhetorical Move Detection in English Abstracts: Multi-label Sentence Classifiers and their Annotated Corpora
2012, cites this paper
Document and Corpus Level Inference For Unsupervised and Transductive Learning of Information Structure of Scientific Documents
2012, influential citation
A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents
2011, influential citation
Discourse Structures and Language Technologies
2011, cites this paper
Discourse structure and language technology
2011, cites this paper
Robustness of DNA-Based Clustering
2011, cites this paper
Weakly supervised learning of information structure of scientific abstracts - is it accurate enough to benefit real-world tasks in biomedicine?
2011, cites this paper
A comparison and user-based evaluation of models of textual information structure in the context of cancer risk assessment
2011, cites this paper
Invited Paper: Discourse Structures and Language Technologies
2011, cites this paper
BioNLP 2010: Year in review
2010, cites this paper
Section classification in clinical notes using supervised hidden markov model
2010, cites this paper
ExaCT: automatic extraction of clinical trial characteristics from journal publications
2010, cites this paper
Biomedical question answering: A survey
2010, cites this paper
Identifying the Information Structure of Scientific Abstracts: An Investigation of Three Different Schemes
2010, cites this paper
Psychiatric document retrieval using a discourse-aware model
2009, cites this paper
Bioinformatics
2009, cites this paper
Robust Argumentative Zoning for Sensemaking in Scholarly Documents
2009, cites this paper
Automatically Classifying Sentences in Full-Text Biomedical Articles into Introduction, Methods, Results and Discussion
2009, cites this paper
Using conditional random fields for result identification in biomedical abstracts
2009, cites this paper
Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion
2009, cites this paper
Result identification for biomedical abstracts using Conditional Random Fields
2008, cites this paper
DNA approach to solve clustering problem based on a mutual order
2008, cites this paper
Identifying Sections in Scientific Abstracts using Conditional Random Fields
2008, cites this paper
An ergonomic format for short reporting in scientific journals using nested tables and the Deming's cycle
2008, cites this paper
Extracting Semantics in a Clinical Scenario
2007, cites this paper
Answering Clinical Questions with Knowledge-Based and Statistical Techniques
2007, cites this paper
Development, implementation, and a cognitive evaluation of a definitional question answering system for physicians
2007, cites this paper
A Study of Structured Clinical Abstracts and the Semantic Classification of Sentences
2007, influential citation
Theoretical and practical advances in genome halving
2004, cites this paper
Discourse Structure and Language Technology (Natural Language Engineering)
year unknown, cites this paper
Sentence Retrieval for Abstracts of Randomized Controlled Trials (BMC Medical Informatics and Decision Making)
year unknown, cites this paper