Using the Web as an Implicit Training Set: Application to Structural Ambiguity Resolution

Published 2005 in Human Language Technology - The Baltic Perspective

ABSTRACT

Recent work has shown that very large corpora can act as training data for NLP algorithms even without explicit labels. In this paper we show how the use of surface features and paraphrases in queries against search engines can be used to infer labels for structural ambiguity resolution tasks. Using unsupervised algorithms, we achieve 84% precision on PP-attachment and 80% on noun compound coordination.
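As a rough illustration of the idea in the abstract, the sketch below decides a PP attachment by comparing how strongly the preposition co-occurs with the verb versus the noun in page-hit counts. The `FAKE_COUNTS` table, the `hit_count` lookup, the `attach_pp` function, and the ratio-based decision rule are all assumptions made for illustration. They are not the paper's actual features or queries, and no real search engine is called.

```python
from typing import Callable, Dict

# Hypothetical page-hit counts standing in for search-engine results.
# These numbers are invented for illustration only.
FAKE_COUNTS: Dict[str, int] = {
    "ate with": 1200,
    "spaghetti with": 300,
    "ate": 50000,
    "spaghetti": 20000,
}


def hit_count(query: str) -> int:
    """Look up an (invented) hit count; unseen queries count as zero."""
    return FAKE_COUNTS.get(query, 0)


def attach_pp(
    verb: str,
    noun: str,
    prep: str,
    count: Callable[[str], int] = hit_count,
) -> str:
    """Guess whether the PP headed by `prep` attaches to the verb or the noun.

    Each candidate head is scored by count("head prep") / count("head").
    The function abstains ("undecided") when the two scores are equal,
    including the case where no counts are available at all.
    """

    def association(head: str) -> float:
        total = count(head)
        return count(f"{head} {prep}") / total if total else 0.0

    verb_score = association(verb)
    noun_score = association(noun)
    if verb_score == noun_score:
        return "undecided"
    return "verb" if verb_score > noun_score else "noun"


if __name__ == "__main__":
    # "ate spaghetti with (a fork)": which head does "with" prefer?
    print(attach_pp("ate", "spaghetti", "with"))
```

Because the function abstains when the counts give no preference, a system built this way would naturally be evaluated by precision on the cases it does decide, which is one plausible reading of the precision figures quoted in the abstract.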

PUBLICATION RECORD

Publication year
2005
Venue
Human Language Technology - The Baltic Perspective
Publication date
2005-10-06
Fields of study
Linguistics, Computer Science
Identifiers
DOI 10.3115/1220575.1220680
Source metadata
Semantic Scholar

REFERENCES

Search Engine Statistics Beyond the n-Gram: Application to Noun Compound Bracketing
2005, cited by this paper
Web-based models for natural language processing
2005, cited by this paper
Using a Distributional Thesaurus to Resolve Coordination Ambiguities
2005, cited by this paper
The Web as a Baseline: Evaluating the Performance of Unsupervised Web-based Models for a Range of NLP Tasks
2004, cited by this paper
Learning random walk models for inducing word dependency distributions
2004, influential reference
Improving Prepositional Phrase Attachment Disambiguation Using the Web as Corpus
2003, cited by this paper
Bracketing Compound Nouns for Logic Form Derivation
2002, cited by this paper
Exploiting the WWW as a corpus to resolve PP attachment ambiguities
2001, cited by this paper
Scaling to Very Very Large Corpora for Natural Language Disambiguation
2001, cited by this paper
Scaling up. Using the WWW to Resolve PP Attachment Ambiguities
2000, cited by this paper
An Unsupervised Approach to Prepositional Phrase Attachment using Contextually Similar Words
2000, influential reference
Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language
1999, cited by this paper
An Unsupervised Model for Statistically Determining Coordinate Phrase Attachment
1999, cited by this paper
Statistical Models for Unsupervised Prepositional Phrase Attachment
1998, cited by this paper
Three Generative, Lexicalised Models for Statistical Parsing
1997, cited by this paper
Corpus Based PP Attachment Ambiguity Resolution with a Semantic Dictionary
1997, influential reference
Prepositional Phrase Attachment through a Backed-off Model
1995, influential reference
A Maximum Entropy Model for Prepositional Phrase Attachment
1994, influential reference
A Rule-Based Approach to Prepositional Phrase Attachment Disambiguation
1994, cited by this paper
Building a Large Annotated Corpus of English: The Penn Treebank
1993, influential reference
Selection and information: a class-based approach to lexical relationships
1993, cited by this paper
A Simple but Useful Approach to Conjunct Identification
1992, cited by this paper
Dynamic Programming Method for Analyzing Conjunctive Structures in Japanese
1992, influential reference
Structural Ambiguity and Lexical Relations
1991, influential reference
Coping with Syntactic Ambiguity or How to Put the Block in the Box on the Table
1982, cited by this paper
Statistical Methods for Rates and Proportions, 2nd ed
1981, cited by this paper
Statistical methods for rates and proportions
1973, influential reference

CITED BY

Using Domain-Specific Corpora for Improved Handling of Ambiguity in Requirements
2021, cites this paper
Using Word Sketches to Resolve Prepositional Phrase Attachment Ambiguity in Arabic
2019, cites this paper
Noun–Noun Compound Analysis: A Holistic Perspective
2019, influential citation
Ontology-Based Ambiguity Resolution of Manufacturing Text for Formal Rule Extraction
2019, cites this paper
Prediction of Mathematical Expression Declarations based on Spatial, Semantic, and Syntactic Analysis
2019, cites this paper
Using Brain Imaging to Gauge Difficulties in Processing Ambiguous Text by Non-native Speakers
2019, cites this paper
Using the Web as an Implicit Training Set: Application to Noun Compound Syntax and Semantics
2019, cites this paper
An Analysis of Prepositional-Phrase Attachment Disambiguation
2018, cites this paper
Improving Syntactic Parsing of Clinical Text Using Domain Knowledge
2017, cites this paper
Toward Solving Penn Treebank Parsing
2017, cites this paper
Pattern-based methods for Improved Lexical Semantics and Word Embeddings
2017, cites this paper
Learning to Rank for Coordination Detection
2017, cites this paper
Integrating Selectional Constraints and Subcategorization Frames in a Dependency Parser
2016, influential citation
Symmetric Patterns and Coordinations: Fast and Enhanced Representations of Verbs and Adjectives
2016, cites this paper
Parsing Paraphrases with Joint Inference
2015, cites this paper
Evaluating Parsers with Dependency Constraints
2015, influential citation
Painless Labeling with Application to Text Mining
2015, cites this paper
Web-scale Surface and Syntactic n-gram Features for Dependency Parsing
2015, cites this paper
Disambiguating prepositional phrase attachment sites with sense information captured in contextualized distributional data
2014, cites this paper
A HYBRID METHOD OF LINGUISTIC APPROACH AND STATISTICAL METHOD FOR NESTED NOUN COMPOUND EXTRACTION
2014, cites this paper
Web as a Corpus: Going Beyond the n-gram
2014, cites this paper
Comparison between Rongorongo and the syllable sequence of ancient chants from the Easter Island
2013, cites this paper
On the interpretation of noun compounds: Syntax, semantics, and entailment
2013, influential citation
Surface Web Semantics for Structured Natural Language Processing
2013, influential citation
Improving term extraction with linguistic analysis in the biomedical domain
2013, cites this paper
Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study
2013, cites this paper
Annotating Signs of Syntactic Complexity to Support Sentence Simplification
2013, cites this paper
A Neurocomputational Approach to Prepositional Phrase Attachment Ambiguity Resolution
2012, cites this paper
Semi-Supervised Noun Compound Analysis with Edge and Span Features
2012, influential citation
UNIVERSITY OF ALGARVE FACULTY OF SOCIAL AND HUMAN SCIENCE THE UNIVERSITY OF WOLVERHAMPTON SCHOOL OF LAW, SOCIAL SCIENCES AND COMMUNICATIONS
2012, cites this paper
Extracting Unambiguous Keywords from Microposts using Web and Query Logs Data
2012, cites this paper
Annotating Coordination in the Penn Treebank
2012, cites this paper
Semi-supervised Dependency Parsing using Lexical Affinities
2012, cites this paper
Domain Adaptation of a Dependency Parser with a Class-Class Selectional Preference Model
2012, cites this paper
Coordination Structure Analysis using Dual Decomposition
2012, cites this paper
présentée à l'Université d'Avignon et des Pays de Vaucluse pour obtenir le diplôme de DOCTORAT
2011, cites this paper
Web-Scale Features for Full-Scale Parsing
2011, influential citation
The Acquisition Of Lexical Knowledge From The Web For Aspects Of Semantic Interpretation
2011, cites this paper
IRILD: An Information Retrieval Based Method for Information Leak Detection
2011, cites this paper
Comparing methods for the syntactic simplification of sentences in information extraction
2011, cites this paper
Modèles de langage ad hoc pour la reconnaissance automatique de la parole. (Ad-hoc language models for automatic speech recognition)
2011, cites this paper
Generative Modeling of Coordination by Factoring Parallelism and Selectional Preferences
2011, cites this paper
Splitting Noun Compounds via Monolingual and Bilingual Paraphrasing: A Study on Japanese Katakana Words
2011, cites this paper
Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation
2011, influential citation
Coordination resolution in biomedical texts
2011, cites this paper
Trawling the Deep Web
2011, cites this paper
A Methodology for Automatic Identification of Nocuous Ambiguity
2010, cites this paper
Improving Syntactic Coordination Resolution using Language Modeling
2010, cites this paper
Bypassed Alignment Graphs for Detection and Scope Disambiguation of Japanese Coordinate Phrases
2010, cites this paper
Automatic detection of nocuous coordination ambiguities in natural language requirements
2010, cites this paper
Coordination Analysis Using Global Structural Constraints and Alignment-based Local Features
2010, cites this paper
Large-scale semi-supervised learning for natural language processing
2010, cites this paper
The Web as a Privacy Lab
2010, cites this paper
Coordinate Structure Analysis with Global Structural Constraints and Alignment-Based Local Features
2009, cites this paper
Using Lexical Patterns in the Google Web 1T Corpus to Deduce Semantic Relations Between Nouns
2009, cites this paper
Mapping Verbal Argument Preferences to Deverbals
2009, cites this paper
A psycholinguistic model of natural language parsing implemented in simulated neurons
2009, cites this paper
Improving classification accuracy using automatically extracted training data
2009, cites this paper
Examining the Use of Region Web Counts for ESL Error Detection
2009, cites this paper
Large-Scale Syntactic Processing: Parsing the Web Final Report of the 2009 JHU CLSP Workshop
2009, cites this paper
Prepositional phrase attachment ambiguity resolution using semantic hierarchies
2009, influential citation
Towards identifying intervention arms in randomized controlled trials: Extracting coordinating constructions
2009, cites this paper
Conquering Language: Using NLP on a Massive Scale to Build High Dimensional Language Models from the Web
2009, cites this paper
Biomedical Text Mining Based on Machine Learning: from Information Extraction to Coordination Identification
2008, influential citation
Automatic Identification of Nocuous Ambiguity
2008, cites this paper
Acquisition automatique de traductions d'unités lexicales complexes à partir du Web. (Automatic Web acquisition of complex lexical units translations)
2008, cites this paper
Prepositional phrase attachment ambiguity resolution using word sense hierarchies Kailash Nadh
2008, influential citation
The Open University’s repository of research publications and other research outputs Automatic identification of nocuous ambiguity
2008, cites this paper
Coordination Disambiguation without Any Similarities
2008, cites this paper
Are Morpho-Syntactic Features More Predictive for the Resolution of Noun Phrase Coordination Ambiguity than Lexico-Semantic Similarity Scores?
2008, cites this paper
Detecting privacy leaks using corpus-based association rules
2008, cites this paper
Coordinate Noun Phrase Disambiguation in a Generative Parsing Model
2007, cites this paper
Web-Based Inference Detection
2007, cites this paper
Learning Noun Phrase Query Segmentation
2007, influential citation
Semantic Classification of Noun Phrases Using Web Counts and Learning Algorithms
2007, cites this paper
A Discriminative Learning Model for Coordinate Conjunctions
2007, cites this paper
UCD-PN: Classification of Semantic Relations Between Nominals using WordNet and Web Counts
2007, cites this paper
Open Information Extraction from the Web
2007, cites this paper
Resolution of Coordination Ellipses in Biological Named Entities Using Conditional Random Fields
2007, cites this paper
Information extraction from unstructured web text
2007, cites this paper
Identifying Nocuous Ambiguities in Natural Language Requirements
2006, cites this paper
Using Verbs to Characterize Noun-Noun Relations
2006, cites this paper
Toward Practical Spoken Language Translation
2005, cites this paper
Towards a Discourse Relation-aware Approach for Chinese-english Machine Translation a Mapping-based Approach for General Formal Human Computer Interaction Using Natural Language Learning Grammar with Explicit Annotations for Subordinating Conjunctions Multi-document Summarization Using Distortion-ra
year unknown, cites this paper
Prepositional-Phrase Attachment Disambiguation Using Derived Semantic Information and Large External Corpora
year unknown, cites this paper