Maximum entropy techniques for exploiting syntactic, semantic and collocational dependencies in language modeling

Published 2000 in Computer Speech and Language

ABSTRACT

A new statistical language model is presented which combines collocational dependencies with two important sources of long-range statistical dependence: the syntactic structure and the topic of a sentence. These dependencies, or constraints, are integrated using the maximum entropy technique. Substantial improvements over a trigram model are demonstrated in both perplexity and speech recognition accuracy on the Switchboard task. A detailed analysis of the performance of this language model is provided in order to characterize the manner in which it performs better than a standard N-gram model. It is shown that topic dependencies are most useful in predicting words which are semantically related by the subject matter of the conversation. Syntactic dependencies, on the other hand, are found to be most helpful in positions where the best predictors of the following word are not within N-gram range due to an intervening phrase or clause. It is also shown that these two methods individually enhance an N-gram model in complementary ways, and that the overall improvement from their combination is nearly additive.
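As a point of reference, models of this kind conventionally take the conditional exponential form of maximum entropy modeling; the sketch below follows that standard formulation and is not quoted from the paper, and the feature example is an illustrative assumption rather than the paper's exact feature set:

\[
P(w \mid h) \;=\; \frac{1}{Z(h)} \exp\Big(\sum_i \lambda_i f_i(h, w)\Big),
\qquad
Z(h) \;=\; \sum_{w'} \exp\Big(\sum_i \lambda_i f_i(h, w')\Big),
\]

where $h$ is the conditioning history (the $N$-gram context together with syntactic and topic information), each $f_i$ is a binary feature function encoding an $N$-gram, syntactic, or topic constraint, and the weights $\lambda_i$ are estimated so that the model's feature expectations match their empirical values on training data. A hypothetical topic feature, for instance, would fire as $f_{t,v}(h, w) = 1$ exactly when the current topic estimate in $h$ is $t$ and the predicted word $w$ is $v$, and $0$ otherwise.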
PUBLICATION RECORD
- Publication date: 2000-10-01
- Venue: Computer Speech and Language
- Fields of study: Linguistics, Computer Science
- Source: Semantic Scholar