A Hybrid Approach to Adaptive Statistical Language Modeling
Published 1994 in Human Language Technology - The Baltic Perspective
ABSTRACT
We describe our latest attempt at adaptive language modeling. At the heart of our approach is a Maximum Entropy (ME) model, which incorporates many knowledge sources in a consistent manner. The other components are a selective unigram cache, a conditional bigram cache, and a conventional static trigram model. We describe the knowledge sources used to build such a model with ARPA's official WSJ corpus, and report on the perplexity and word error rate results obtained with it. Three different adaptation paradigms are then discussed, and an additional experiment, based on AP wire data, is used to compare them.
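The abstract combines a static model with dynamic caches that track the recent document. As a minimal sketch of that adaptation idea, the toy class below interpolates a unigram cache with a fixed static distribution; the interpolation weight `lam`, the class name, and the simplified static model are illustrative assumptions, not the paper's actual ME formulation.

```python
from collections import Counter

class CachedLM:
    """Hedged sketch: a static model adapted with a unigram cache.

    The paper's full system also includes an ME model and a conditional
    bigram cache; only the cache/static interpolation idea is shown here.
    """

    def __init__(self, static_probs, lam=0.1):
        self.static = static_probs   # dict: word -> static probability (assumed toy model)
        self.lam = lam               # cache interpolation weight (assumption)
        self.cache = Counter()       # word counts from the document so far

    def observe(self, word):
        # Update the cache as the document is processed.
        self.cache[word] += 1

    def prob(self, word):
        n = sum(self.cache.values())
        p_cache = self.cache[word] / n if n else 0.0
        p_static = self.static.get(word, 1e-6)  # tiny floor for unseen words
        # Linear interpolation of the cache and static estimates.
        return self.lam * p_cache + (1 - self.lam) * p_static
```

After observing a rare word once, its probability rises sharply relative to the static estimate, which is the behavior a unigram cache is meant to capture.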
PUBLICATION RECORD
- Publication year: 1994
- Venue: Human Language Technology - The Baltic Perspective
- Publication date: 1994-03-08
- Fields of study: Computer Science
- Source metadata: Semantic Scholar