Improving Data Driven Wordclass Tagging by System Combination

H. Halteren, Jakub Zavrel, Walter Daelemans

Published 1998 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

In this paper we examine how the differences in modelling between different data-driven systems performing the same NLP task can be exploited to yield a higher accuracy than the best individual system. We do this by means of an experiment involving the task of morpho-syntactic wordclass tagging. Four well-known tagger generators (Hidden Markov Model, Memory-Based, Transformation Rules and Maximum Entropy) are trained on the same corpus data. After comparison, their outputs are combined using several voting strategies and second-stage classifiers. All combination taggers outperform their best component, with the best combination showing a 19.1% lower error rate than the best individual tagger.
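
The simplest of the voting strategies mentioned in the abstract can be sketched as plurality voting over the per-token outputs of the component taggers. The sketch below is illustrative only and is not the authors' implementation: the function name `majority_vote`, the `tie_break_index` parameter, the tie-breaking rule, and the example tags are all assumptions introduced here.

```python
from collections import Counter


def majority_vote(tags_per_tagger, tie_break_index=0):
    """Combine tag sequences from several taggers by plurality voting.

    tags_per_tagger: list of equal-length tag sequences, one per tagger.
    tie_break_index: index of the tagger trusted when votes are tied
                     (an assumption of this sketch, not taken from the paper).
    """
    combined = []
    # zip(*...) yields, for each token, the tuple of tags proposed by all taggers.
    for token_tags in zip(*tags_per_tagger):
        counts = Counter(token_tags)
        top = max(counts.values())
        # Counter keeps insertion order, so winners are listed in tagger order.
        winners = [tag for tag in counts if counts[tag] == top]
        fallback = token_tags[tie_break_index]
        if len(winners) == 1:
            combined.append(winners[0])
        elif fallback in winners:
            combined.append(fallback)
        else:
            combined.append(winners[0])
    return combined


# Hypothetical outputs of four taggers for a four-token sentence.
hmm = ["DT", "NN", "VB", "NN"]
memory_based = ["DT", "NN", "VBZ", "NN"]
transformation = ["DT", "JJ", "VBZ", "NN"]
max_entropy = ["DT", "NN", "VBZ", "VB"]

result = majority_vote([hmm, memory_based, transformation, max_entropy])
print(result)
```

Each tagger errs on a different token in this toy input, so the vote recovers a sequence that no single tagger produced on its own, which is the intuition behind exploiting differences between systems. Weighted voting and second-stage classifiers would replace the plain count with learned weights or a trained model.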

PUBLICATION RECORD

Publication year
1998
Venue
Annual Meeting of the Association for Computational Linguistics
Publication date
1998-07-31
Fields of study
Computer Science
Identifiers
DOI 10.3115/980845.980928
arXiv cmp-lg/9807013
Source metadata
Semantic Scholar

REFERENCES

Enriching the Knowledge Sources Used in a Maximum Entropy Part-of-Speech Tagger
2000cited by this paper
Book Reviews: Syntactic Wordclass Tagging
2000cited by this paper
MBT: A Memory-Based Part of Speech Tagger-Generator
1996cited by this paper
A Maximum Entropy Model for Part-Of-Speech Tagging
1996cited by this paper
Error Correlation and Error Reduction in Ensemble Classifiers
1996cited by this paper
Comparison of tagging strategies, a prelude to democratic tagging
1996cited by this paper
C4.5: Programs for Machine Learning (Book Review)
1995cited by this paper
A Comparative Evaluation of Voting and Meta-learning on Partitioned Data
1995cited by this paper
Some Advances in Transformation-Based Part of Speech Tagging
1994cited by this paper
Programs for Machine Learning
1994cited by this paper
A Simple Rule-Based Part of Speech Tagger
1992cited by this paper
Stacked generalization
1992cited by this paper
C4.5: Programs for Machine Learning
1992cited by this paper
The tagged LOB Corpus: user's manual
1986cited by this paper

CITED BY

Parts-of-speech tagging of Nepali texts with Bidirectional LSTM, Conditional Random Fields and HMM
2023cites this paper
Detecting Objectifying Language in Online Professor Reviews
2020cites this paper
Battery state of charge estimation based on multi-model fusion
2019cites this paper
Intent Detection for Spoken Language Understanding Using a Deep Ensemble Model
2018cites this paper
EnsembleForest: A Classifier Combination Method on the example of Part-of-Speech Tagging
2017influential citation
A study on safety accidents of children's products based on stacking framework
2017cites this paper
Combining Lexical and Syntactic Features for Detecting Content-dense Texts in News
2017cites this paper
Malay Part of Speech Tagger: A Comparative Study on Tagging Tools
2015influential citation
Content selection in multi-document summarization
2015cites this paper
Application of a POS Tagger to a Novel Chronological Division of Early Modern German Text
2015cites this paper
Cross-domain polarity classification using a knowledge-enhanced meta-classifier
2015cites this paper
Novel harmony search-based algorithms for part-of-speech tagging
2014cites this paper
Intelligent Combination of Structural Analysis Algorithms: Application to Mathematical Expression Recognition
2014cites this paper
Hybrid PoS-tagging: A cooperation of evolutionary and statistical approaches
2014cites this paper
Brill Tagging using the Micron Automata Processor
2014cites this paper
Design and Implementation of an Automatic Scoring Model Using a Voting Method for Descriptive Answers
2013cites this paper
A comparative study of classifier combination applied to NLP tasks
2013cites this paper
Cooperation of evolutionary and statistical PoS-tagging
2012cites this paper
An Adaptive Framework for Named Entity Combination
2012cites this paper
Machine transliteration survey
2011cites this paper
A Comparative Study of Classifier Combination Methods Applied to NLP Tasks
2011cites this paper
Semantics and Relativity Expansion Based on Tag Recommendation with Time Degradation
2011cites this paper
Application of Weighted Voting Taggers to Languages Described with Large Tagsets
2010cites this paper
Towards Robust Multi-Tool Tagging. An OWL/DL-Based Approach
2010cites this paper
Hierarchical Phrase-Based Translation with Weighted Finite-State Transducers and Shallow-
2010cites this paper
Classifier Combination Systems and their Application in Human Language Technology
2010cites this paper
Integrating Parsing and Word Alignment in Syntax-Based Machine Translation
2010cites this paper
Supertagging: Using Complex Lexical Descriptions in Natural Language Processing
2010cites this paper
La Combinación de Sistemas y el PLN
2010cites this paper
Improving Hierarchical Document Signature Performance by Classifier Combination
2010cites this paper
Algorithms and Data Design Issues for Basic NLP Tools
2009cites this paper
Name Matching between Roman and Chinese Scripts: Machine Complements Human
2009influential citation
Automating the conversion of natural language fiction to multi-modal 3D animated virtual environments
2009cites this paper
High Accuracy Tagging with Large Tagsets
2008cites this paper
Machine transliteration of proper names between English and Persian
2008cites this paper
'n Woordsoortetiketteerder vir Afrikaans
2008cites this paper
Accuracy of Baseline and Complex Methods Applied to Morphosyntactic Tagging of Polish
2008cites this paper
Evaluating and improving morpho-syntactic classification over multiple corpora using pre-trained, "off-the-shelf", parts-of-speech tagging tools
2008cites this paper
Coping With Alternate Formulations Of Questions And Answers
2008cites this paper
Tagging with Combined Language Models and Large Tagsets
2008cites this paper
Machine Transliteration of Proper Names between
2008cites this paper
Applying System Combination to Base Noun Phrase Identification
2007cites this paper
Recent Advances in Memory-based Part-of-speech Tagging
2007cites this paper
Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University
2007cites this paper
A Case Study of Algorithms for Morphosyntactic Tagging of Polish Language
2007cites this paper
Automatic acquisition of semantic classes for adjectives
2007cites this paper
ii Preface
2007cites this paper
Charting Democracy Across Parsers
2007cites this paper
Modelling Polysemy in Adjective Classes by Multi-Label Classification
2007cites this paper
Statistical POS tagging experiments on Persian text
2007cites this paper
Minority Vote: At-Least-N Voting Improves Recall for Extracting Relations
2006cites this paper
An Empirical Study of Chinese Chunking
2006cites this paper
Vocabulary Alignment via Basic Level Concepts
2006cites this paper
MID-YEAR PROGRESS REPORT Candidate : Kevin
2006cites this paper
Part-of-Speech Tagging of Transcribed Speech
2006influential citation
Automatically Inducing a Part-of-Speech Tagger by Projecting from Multiple Source Languages Across Aligned Corpora
2005cites this paper
Voting Between Multiple Data Representations for Text Chunking
2005cites this paper
Extending the corpus of contemporary Arabic
2005cites this paper
Segmenting documents by stylistic character
2005cites this paper
Improving dependency analysis by syntactic parser combination
2005influential citation
Evaluating parts-of-speech taggers for use in a text-to-scene conversion system
2005cites this paper
From distributional to semantic similarity
2004cites this paper
Improving part-of-speech tagging using lexicalized HMMs
2004influential citation
Fusionner pour mieux analyser : Conception et évaluation de la plate-forme de combinaison
2004cites this paper
Acquiring Causal Knowledge from Text Using Connective Markers
2004cites this paper
SVMTool: A general POS Tagger Generator Based on Support Vector Machines
2004cites this paper
VOTING BETWEEN MULTIPLE DATA REPRESENTATIONS FOR TEXT CHUNKING by Hong Shen
2004cites this paper
POS Tagging of Hungarian with Combined Statistical and Rule-Based Methods
2004cites this paper
Syntactic parser combination for improved dependency analysis
2004cites this paper
Retrieving NASA problem reports: a case study in natural language information retrieval
2004cites this paper
Fast and Accurate Part-of-Speech Tagging: The SVM Approach Revisited
2003cites this paper
Matrix: a statistical method and software tool for linguistic analysis through corpus comparison
2003cites this paper
Using decision trees to learn lexical information in a linguistics-based NLP system
2003cites this paper
A SNoW Based Supertagger with Application to NP Chunking
2003cites this paper
Chinese Word Segmentation as LMR Tagging
2003cites this paper
Segmenting a document by stylistic character
2003cites this paper
Performance Analysis of a Part of Speech Tagging Task
2003influential citation
Improving part-of-speech tagging using lexicalized HMMs
2003cites this paper
Impact of imperfect OCR on part-of-speech tagging
2003cites this paper
Combining POS-taggers for improved accuracy on Swedish text
2003cites this paper
Unsupervised Italian Word Sense Disambiguation using WordNets and Unlabeled Corpora
2002cites this paper
Learning with Multiple Stacking for Named Entity Recognition
2002cites this paper
DEREKO (DEutsches REferenzKOrpus) German Reference Corpus Final Report (Part I)
2002cites this paper
Combining Outputs of Multiple Japanese Named Entity Chunkers by Stacking
2002influential citation
Recognising Clauses Using Symbolic and Machine Learning Approaches
2002cites this paper
Efficient Stochastic Part-of-Speech Tagging for Hungarian
2002cites this paper
Towards efficient statistical parsing using lexicalized grammatical information
2002cites this paper
Annotating Topological Fields and Chunks - and Revising POS Tags at the Same Time
2002cites this paper
Combining Classifiers for word sense disambiguation
2002cites this paper
Ensemble Methods for Automatic Thesaurus Extraction
2002cites this paper
Modeling Consensus: Classifier Combination for Word Sense Disambiguation
2002influential citation
A Hybrid Architecture for Robust Parsing of German
2002cites this paper
The Unknown Word Problem: a Morphological Analysis of Japanese Using Maximum Entropy Aided by a Dictionary
2001cites this paper
Improving Accuracy in word class tagging through the Combination of Machine Learning Systems
2001cites this paper
The Johns Hopkins SENSEVAL-2 System Descriptions
2001cites this paper
Learning Computational Grammars
2001cites this paper
Scaling to Very Very Large Corpora for Natural Language Disambiguation
2001cites this paper
Mapping Lexical Entries in a Verbs Database to WordNet Senses
2001cites this paper
Committee-based Decision Making in Probabilistic Partial Parsing
2000cites this paper