Toward General-Purpose Learning for Information Extraction

Published 1998 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

Two trends are evident in the recent evolution of the field of information extraction: a preference for simple, often corpus-driven techniques over linguistically sophisticated ones; and a broadening of the central problem definition to include many non-traditional text domains. This development calls for information extraction systems that are as retargetable and general as possible. Here, we describe SRV, a learning architecture for information extraction which is designed for maximum generality and flexibility. SRV can exploit domain-specific information, including linguistic syntax and lexical information, in the form of features provided to the system explicitly as input for training. This process is illustrated using a domain created from Reuters corporate acquisitions articles. Features are derived from two general-purpose NLP systems, Sleator and Temperley's link grammar parser and WordNet. Experiments compare the learner's performance with and without such linguistic information. Surprisingly, in many cases, the system performs as well without this information as with it.

PUBLICATION RECORD

Publication year: 1998
Venue: Annual Meeting of the Association for Computational Linguistics
Publication date: 1998-08-10
Fields of study: Computer Science
Identifiers: DOI 10.3115/980845.980914
Source metadata: Semantic Scholar

REFERENCES

Relational Learning of Pattern-Match Rules for Information Extraction (1999, cited by this paper)
Information Extraction from HTML: Application of a General Machine Learning Approach (1998, influential reference)
Using grammatical inference to improve precision in information extraction (1997, cited by this paper)
Information extraction (1996, cited by this paper)
Automatically Generating Extraction Patterns from Untagged Text (1996, cited by this paper)
Learning text analysis rules for domain-specific natural language processing (1996, influential reference)
Parsing English with a Link Grammar (1995, cited by this paper)
WordNet: A Lexical Database for English (1995, influential reference)
Automatically Acquiring Conceptual Patterns without an Annotated Corpus (1995, cited by this paper)
Wrap-Up: a Trainable Discourse Module for Information Extraction (1994, cited by this paper)
FASTUS: A Finite-state Processor for Information Extraction from Real-world Text (1993, cited by this paper)
Representation and Learning in Information Retrieval (1991, cited by this paper)
Modeling and artificial intelligence (1991, cited by this paper)
Representation And Learning (1988, cited by this paper)

CITED BY

Research on an Event Extraction Framework Based on Two-Step Prompt Learning for Chinese Policy (2025, cites this paper)
The State of Relation Extraction Data Quality: Is Bigger Always Better? (2024, cites this paper)
Accelerating Human Authorship of Information Extraction Rules (2022, cites this paper)
TDJEE: A Document-Level Joint Model for Financial Event Extraction (2021, cites this paper)
What is Event Knowledge Graph: A Survey (2021, cites this paper)
Dictionary Structure Identification (2021, cites this paper)
Named entity recognition of legal judgment based on small-scale labeled data (2020, cites this paper)
Extracting a Knowledge Base of Mechanisms from COVID-19 Papers (2020, cites this paper)
A Retrospective on Mutual Bootstrapping (2018, cites this paper)
Open-Schema Event Profiling for Massive News Corpora (2018, cites this paper)
A Semi-automatic and low-cost method to learn patterns for named entity recognition (2017, cites this paper)
Evaluating Automatic Learning of Structure for Event Extraction (2017, cites this paper)
RBPB: Regularization-Based Pattern Balancing Method for Event Extraction (2016, cites this paper)
Numerical Relation Extraction with Minimal Supervision (2016, cites this paper)
Distinguishing Past, On-going, and Future Events: The EventStatus Corpus (2016, cites this paper)
Extraction Based on Deep Syntactic-Semantic Analysis (2016, cites this paper)
Declarative Cleaning of Inconsistencies in Information Extraction (2016, cites this paper)
Information Extraction Based on Deep Syntactic-Semantic Analysis (2016, cites this paper)
Document Spanners (2015, cites this paper)
Scalable Semantic Parsing with Partial Ontologies (2015, cites this paper)
Leveraging Pattern Semantics for Extracting Entities in Enterprises (2015, cites this paper)
Information Extraction Grammars (2015, cites this paper)
A Hybrid Approach to General Information Extraction (2015, cites this paper)
Moving towards the semantic web: enabling new technologies through the semantic annotation of social contents (2015, cites this paper)
Improving Information Extraction by Discourse-Guided and Multifaceted Event Recognition (2014, cites this paper)
Database principles in information extraction (2014, cites this paper)
Improved Pattern Learning for Bootstrapped Entity Extraction (2014, cites this paper)
Cleaning inconsistencies in information extraction via prioritized repairs (2014, cites this paper)
Winner-Takes-All based Multi-Strategy Learning for Information Extraction (2014, influential citation)
A Survey on Region Extractors from Web Documents (2013, cites this paper)
Experiential Knowledge Mining (2013, cites this paper)
Spanners: a formal framework for information extraction (2013, cites this paper)
Event Schema Induction with a Probabilistic Entity-Driven Model (2013, cites this paper)
Bootstrapped Training of Event Extraction Classifiers (2012, cites this paper)
Interactive Learning of Relation Extractors with Weak Supervision (2012, cites this paper)
Final Report A SVM Model for Relation Classification of Noun Phrases based on the NELL Database (2012, cites this paper)
Proficient Extraction and Management of Knowledge via Machine Intelligence (2012, cites this paper)
Application of Link Grammar in Semi-Supervised Named Entity Recognition for Accident Domain (2011, cites this paper)
Template-Based Information Extraction without the Templates (2011, cites this paper)
Modelling Entity Instantiations (2011, cites this paper)
Finding best evidence for evidence-based best practice recommendations in health care: the initial decision support system design (2011, cites this paper)
Multi-Inductive Learning approach for Information Extraction (2011, cites this paper)
Peeling Back the Layers: Detecting Event Role Fillers in Secondary Contexts (2011, cites this paper)
An Introduction to the Sundance and AutoSlog Systems (2011, cites this paper)
Learning 5000 Relational Extractors (2010, cites this paper)
DAMASK: Deliverable 1 (2010, cites this paper)
Name entity recognition using inductive logic programming (2010, cites this paper)
Document zoning for enhancing spatial and temporal understanding in Web-based health surveillance systems (2010, cites this paper)
Automating the conversion of natural language fiction to multi-modal 3D animated virtual environments (2009, cites this paper)
A Unified Model of Phrasal and Sentential Evidence for Information Extraction (2009, cites this paper)
Acquiring paraphrases from text corpora (2009, cites this paper)
Turning the Web into a Database: Extracting Data and Structure (2009, cites this paper)
Learning paraphrases from text (2009, cites this paper)
A framework for semantic web implementation based on context-oriented controlled automatic annotation (2009, cites this paper)
Evaluation et détermination de la pertinence pour des syntagmes candidats à la collocation (2008, cites this paper)
Mining manufacturing databases to discover the effect of operation sequence on the product quality (2008, cites this paper)
Boosting text segmentation via progressive classification (2008, cites this paper)
An incrementally trainable statistical approach to information extraction based on token classification and rich context models (2008, cites this paper)
Pattern-based segmentation of digital documents: model and implementation (2008, cites this paper)
Negation recognition in medical narrative reports (2008, cites this paper)
Multilinguïsation des systèmes de e-commerce traitant des énoncés spontanés en langue naturelle. (Multilinguïsation of e-commerce system treating spontaneous utterances in natural language) (2008, cites this paper)
Using ILP to Construct Features for Information Extraction from Semi-structured Text (2007, influential citation)
Exploiting Role-Identifying Nouns and Expressions for Information Extraction (2007, cites this paper)
Effective Information Extraction with Semantic Affinity Patterns and Relevant Regions (2007, cites this paper)
Learning Recursive Patterns for Biomedical Information Extraction (2007, cites this paper)
Ontologies and Information Extraction (2006, influential citation)
Mining Information Extraction Models for HmtDB annotation (2006, cites this paper)
Automatic Sales Lead Generation from Web Data (2006, cites this paper)
Learning for Biomedical Information Extraction with ILP (2006, cites this paper)
Adaptive information extraction (2006, cites this paper)
Knowledge and Information Systems (2006, cites this paper)
Deux Principales Indissociables (2006, cites this paper)
Automated question answering: review of the main approaches (2005, cites this paper)
An Overview and Classification of Adaptive Approaches to Information Extraction (2005, cites this paper)
Information Extraction with Automatic Knowledge Expansion (2005, cites this paper)
Exploiting Subjectivity Classification to Improve Information Extraction (2005, cites this paper)
Algorithms for Minimum Risk Chunking (2005, cites this paper)
Mining knowledge from text using information extraction (2005, cites this paper)
Learning Language in Logic - Genic Interaction Extraction Challenge (2005, cites this paper)
Learning Semantic Parsers: An Important but Under-Studied Problem (2004, cites this paper)
A new method for automatic pattern acquisition to extract information from biomedical texts (2004, cites this paper)
Machine Learning for Information Extraction in Genomics — State of the Art and Perspectives (2004, cites this paper)
Active Learning Selection Strategies for Information Extraction (2003, cites this paper)
Rule-based learning algorithm for fact extraction (2003, cites this paper)
An Integrated System of Mining HTML Texts and Filtering Structured Documents (2003, cites this paper)
Learning Extraction Patterns for Subjective Expressions (2003, cites this paper)
Property-Based Feature Engineering and Selection (2002, cites this paper)
Property-Based Feature Engineering and Selection (2002, influential citation)
Inducing Information Extraction Systems for New Languages via Cross-language Projection (2002, cites this paper)
Learning for Semantic Interpretation: Scaling Up without Dumbing Down (2001, influential citation)
Machine Learning for Information Extraction (2001, influential citation)
Learning for Text Categorization and Information Extraction with ILP (2001, cites this paper)
Machine Learning and Natural Language Processing (2000, cites this paper)
Learning to construct knowledge bases from the World Wide Web (2000, cites this paper)
Learning Dictionaries for Information Extraction by Multi-Level Bootstrapping (1999, cites this paper)
Knowledge Discovery in SportsFinder: An Agent to Extract Sports Results from the Web (1999, cites this paper)
Relational learning techniques for natural language information extraction (1998, influential citation)
Using HTML Formatting to Aid in Natural Language Processing on the World Wide Web (1998, cites this paper)
Information Extraction from HTML: Application of a General Machine Learning Approach (1998, cites this paper)
Learning to Extract Symbolic Knowledge from the World Wide Web (1998, cites this paper)