Error-Driven Pruning of Treebank Grammars for Base Noun Phrase Identification

Published 1998 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

Finding simple, non-recursive, base noun phrases is an important subtask for many natural language processing applications. While previous empirical methods for base NP identification have been rather complex, this paper instead proposes a very simple algorithm that is tailored to the relative simplicity of the task. In particular, we present a corpus-based approach for finding base NPs by matching part-of-speech tag sequences. The training phase of the algorithm is based on two successful techniques: first the base NP grammar is read from a "treebank" corpus; then the grammar is improved by selecting rules with high "benefit" scores. Using this simple algorithm with a naive heuristic for matching rules, we achieve surprising accuracy in an evaluation on the Penn Treebank Wall Street Journal.
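The two training steps and the matching step named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the "benefit" score or the "naive heuristic", so the sketch assumes benefit is the number of correct bracketings a rule produces on a held-out pruning corpus minus the number of incorrect ones, and that matching is greedy, left-to-right, longest-match. The function names, the threshold, and the toy data are all hypothetical.

```python
from collections import Counter

# A sentence is (list of POS tags, set of gold base-NP spans as (start, end), end exclusive).

def extract_rules(corpus):
    """Read the grammar: every POS tag sequence bracketed as a base NP."""
    return {tuple(tags[s:e]) for tags, spans in corpus for s, e in spans}

def match(tags, rules):
    """Assumed heuristic: greedy left-to-right longest match. Returns [(start, end, rule)]."""
    max_len = max((len(r) for r in rules), default=0)
    found, i = [], 0
    while i < len(tags):
        for n in range(min(max_len, len(tags) - i), 0, -1):
            cand = tuple(tags[i:i + n])
            if cand in rules:
                found.append((i, i + n, cand))
                i += n
                break
        else:
            i += 1  # no rule starts here; move on one tag
    return found

def benefit_scores(rules, corpus):
    """Assumed score: +1 per correct bracket, -1 per incorrect bracket, per rule."""
    scores = Counter()
    for tags, spans in corpus:
        for s, e, rule in match(tags, rules):
            scores[rule] += 1 if (s, e) in spans else -1
    return scores

def prune(rules, corpus, threshold=1):
    """Keep rules scoring at least `threshold` (rules that never fire score 0)."""
    scores = benefit_scores(rules, corpus)
    return {r for r in rules if scores[r] >= threshold}

# Toy data (hypothetical): the second training sentence brackets VBG NNS as a base NP.
train = [
    (["DT", "NN", "VBD", "DT", "JJ", "NN"], {(0, 2), (3, 6)}),
    (["VBG", "NNS", "VBP", "NNS"], {(0, 2), (3, 4)}),
]
heldout = [
    (["DT", "NN", "VBZ", "VBG", "NNS"], {(0, 2), (4, 5)}),
    (["DT", "JJ", "NN", "VBD", "NNS"], {(0, 3), (4, 5)}),
]

rules = extract_rules(train)
pruned = prune(rules, heldout)
print(sorted(rules - pruned))                        # rules discarded by pruning
print([(s, e) for s, e, _ in match(heldout[0][0], pruned)])
```

On this toy data the rule VBG NNS produces one wrong bracket on the held-out text and no correct ones, so it is dropped. The sketch prunes in a single pass with a fixed threshold; any iterative or ranked variant would change only `prune`.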

PUBLICATION RECORD

Publication year: 1998
Venue: Annual Meeting of the Association for Computational Linguistics
Publication date: 1998-08-10
Fields of study: Computer Science
Identifiers: DOI 10.3115/980845.980881; arXiv cmp-lg/9808015
Source metadata: Semantic Scholar

REFERENCES

Tree-Bank Grammars
1996, cited by this paper
Speech Recognition by Composition of Weighted Finite Automata
1996, cited by this paper
Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part-of-Speech Tagging
1995, cited by this paper
Text Chunking using Transformation-Based Learning
1995, cited by this paper
NPtool, a Detector of English Noun Phrases
1995, cited by this paper
Technical terminology: some linguistic properties and an algorithm for identification in text
1995, influential reference
Building a Large Annotated Corpus of English: The Penn Treebank
1993, cited by this paper
Surface Grammatical Analysis for the Extraction of Terminological Noun Phrases
1992, cited by this paper
A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text
1988, cited by this paper
Performance structures: A psycholinguistic and linguistic appraisal
1983, cited by this paper

CITED BY

Preparation of Sentiment tagged Parallel Corpus and Testing its effect on Machine Translation
2020, cites this paper
Using genre-specific features for patent summaries
2017, cites this paper
Exploration of an interdisciplinary scientific landscape
2017, influential citation
Noun Phrase Chunking for Turkish Using a Dependency Parser
2015, cites this paper
A Joint Framework for Coreference Resolution and Mention Head Detection
2015, cites this paper
A RULE BASED NOUN PHRASE CHUNKER FOR TURKISH
2014, cites this paper
Bengali noun phrase chunking based on conditional random fields
2014, cites this paper
Naxi sentence similarity calculation based on improved chunking edit-distance
2014, cites this paper
Turkish Constituent Chunking with Morphological and Contextual Features
2013, cites this paper
Resolução de correferência em múltiplos documentos utilizando aprendizado não supervisionado
2011, cites this paper
Linguistics parameters for zero anaphora resolution
2010, cites this paper
Effects of Annotation Errors Searching for linguistic phenomena: The role of recall
2010, cites this paper
On the Role of Lexical Features in Sequence Labeling
2009, cites this paper
Highly accurate error-driven method for noun phrase detection
2008, cites this paper
SVM Model Tampering and Anchored Learning: A Case Study in Hebrew NP Chunking
2007, cites this paper
Creating an Appropriate Corpus for PP Attachment Training
2007, cites this paper
Improving End-User Efficiency Using the Smart/Empire IR System
2007, cites this paper
Identification of Noun Phrase with Various Granularities
2007, cites this paper
Database Selection for Longer
2007, cites this paper
Noun Phrase Chunking in Hebrew: Influence of Lexical and Morphological Features
2006, cites this paper
A Set of NP-Extraction Rules for Portuguese: Defining, Learning and Pruning
2006, cites this paper
Rough Set Based Approach to Base Noun Phrase Identification
2006, cites this paper
A Hybrid Approach to Chinese Base Noun Phrase Chunking
2006, cites this paper
Improving Chinese text Chunkings precision using Transformation-based Learning
2006, cites this paper
Prune Diseased Branches to Get Healthy Trees! How to Find Erroneous Local Trees in a Treebank and Why It Matters
2005, influential citation
Voting Between Multiple Data Representations for Text Chunking
2005, cites this paper
Research and realization of naive Bayes English text classification method based on base noun phrase identification
2005, cites this paper
Extracting Partial Parsing Rules from Tree-Annotated Corpus: Toward Deterministic Global Parsing
2005, cites this paper
Automatic Partial Parsing Rule Acquisition Using Decision Tree Induction
2005, cites this paper
Application of Information Technology: Improved Identification of Noun Phrases in Clinical Radiology Reports Using a High-Performance Statistical Natural Language Parser Augmented with the UMLS Specialist Lexicon
2005, cites this paper
Evolutionary algorithm for noun phrase detection in natural language processing
2005, cites this paper
High precision English base noun phrase identification based on "waterfall" model
2005, cites this paper
利用向量支撐機辨識中文基底名詞組的初步研究 (A Preliminary Study on Chinese Base NP Detection using SVM) [In Chinese]
2005, cites this paper
Statistical Recognition of Noun Phrases in Unrestricted Text
2005, cites this paper
TO NOUN PHRASE COREFERENCE RESOLUTION
2004, cites this paper
Symbiosis of evolutionary techniques and statistical natural language processing
2004, cites this paper
Podado y lexicalización de reglas gramaticales y su aplicación al análisis sintáctico parcial
2004, cites this paper
Phrase Chunking for Efficient Parsing in Machine Translation System
2004, cites this paper
Improving Machine Learning Approaches to Noun Phrase Coreference Resolution
2004, cites this paper
VOTING BETWEEN MULTIPLE DATA REPRESENTATIONS FOR TEXT CHUNKING by Hong Shen
2004, cites this paper
Base Noun Phrase Chunking with Support Vector Machines
2004, influential citation
Inference with Classifiers: The Phrase Identification Problem
2004, cites this paper
Database Selection for Longer Queries
2004, cites this paper
Machine learning approaches for Chinese shallow parsers
2003, cites this paper
An island-driven parsing system
2003, cites this paper
Grammar learning by partition search
2002, cites this paper
Shallow Parsing using Noisy and Non-Stationary Training Material
2002, cites this paper
Intelligent Clustering with Instance-Level Constraints
2002, cites this paper
Shallow Parsing with PoS Taggers and Linguistic Features
2002, cites this paper
ITRI-02-09 Grammar Learning by Partition Search
2002, cites this paper
Memory-Based Shallow Parsing
2002, cites this paper
Experiments in learning models for functional chunking of Chinese text
2001, cites this paper
Limitations of Co-Training for Natural Language Learning from Large Datasets
2001, cites this paper
The Use of Classifiers in Sequential Inference
2001, cites this paper
Shallow Parsing By Weighted Probabilistic Sum
2001, cites this paper
Exploring evidence for shallow parsing
2001, cites this paper
Optimisation of corpus-derived probabilistic grammars
2001, influential citation
Chunking + Island-Driven Parsing = Full Parsing
2001, cites this paper
Noun phrase chunking with APL2
2000, influential citation
Rule Writing or Annotation: Cost-efficient Resource Usage for Base Noun Phrase Chunking
2000, cites this paper
Shallow Parsing as Part-of-Speech Tagging
2000, cites this paper
Noun Phrase Recognition by System Combination
2000, cites this paper
Machine Learning and Natural Language Processing
2000, cites this paper
Tagging and Chunking with Bigrams
2000, cites this paper
Shallow Parsing by Inferencing with Classifiers
2000, cites this paper
Examining the Role of Statistical and Linguistic Knowledge Sources in a General-Knowledge Question-Answering System
2000, cites this paper
Incorporating Compositional Evidence in Memory-Based Partial Parsing
2000, cites this paper
Theory Refinement and Natural Language Learning
2000, cites this paper
Shallow Parsing by Inferencing with Classifiers
2000, cites this paper
A Unified Statistical Model for the Identification of English BaseNP
2000, cites this paper
Memory-Based Shallow Parsing
1999, cites this paper
Cascaded Markov Models
1999, cites this paper
The Role of Lexicalization and Pruning for Base Noun Phrase Grammars
1999, cites this paper
MDL-based DCG Induction for NP Identification
1999, cites this paper
Tagging and parsing with cascaded Markov models: automation of corpus annotation
1999, cites this paper
Learning a lightweight robust deterministic parser
1999, cites this paper
Noun Phrase Coreference as Clustering
1999, cites this paper
Are phrase structured grammars useful in statistical parsing
1999, cites this paper
Representing Text Chunks
1999, cites this paper
Man vs. Machine: A Case Study in Base Noun Phrase Learning
1999, cites this paper
A Learning Approach to Shallow Parsing
1999, cites this paper
A Memory-Based Approach to Learning Shallow Natural Language Patterns
1998, cites this paper
The Smart/Empire TIPSTER IR System
1998, cites this paper