Similarity-Based Estimation of Word Cooccurrence Probabilities

Ido Dagan, Fernando C Pereira, Lillian Lee

Published 1994 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

In many applications of natural language processing it is necessary to determine the likelihood of a given word combination. For example, a speech recognizer may need to determine which of the two word combinations "eat a peach" and "eat a beach" is more likely. Statistical NLP methods determine the likelihood of a word combination according to its frequency in a training corpus. However, the nature of language is such that many word combinations are infrequent and do not occur in a given corpus. In this work we propose a method for estimating the probability of such previously unseen word combinations using available information on "most similar" words. We describe a probabilistic word association model based on distributional word similarity, and apply it to improving probability estimates for unseen word bigrams in a variant of Katz's back-off model. The similarity-based method yields a 20% perplexity improvement in the prediction of unseen bigrams and statistically significant reductions in speech-recognition error.
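
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the cosine similarity measure, the neighbour count `k`, the toy corpus, and all function names are assumptions chosen for illustration, and the discounting and normalization steps of a back-off model are omitted. It only shows how the probability of an unseen bigram might be estimated by averaging over the words most similar to its first word.

```python
from collections import Counter, defaultdict
import math

def conditional_probs(bigrams):
    """Maximum-likelihood P(w2 | w1) from a list of (w1, w2) pairs."""
    counts = defaultdict(Counter)
    for w1, w2 in bigrams:
        counts[w1][w2] += 1
    return {
        w1: {w2: c / sum(nxt.values()) for w2, c in nxt.items()}
        for w1, nxt in counts.items()
    }

def cosine(p, q):
    """Cosine similarity of two sparse distributions (an illustrative choice)."""
    dot = sum(v * q.get(key, 0.0) for key, v in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0

def similarity_estimate(w1, w2, probs, k=2):
    """Similarity-weighted average of P(w2 | w1') over the k nearest w1'."""
    neighbours = sorted(
        ((cosine(probs[w1], probs[other]), other) for other in probs if other != w1),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return 0.0
    return sum(sim * probs[other].get(w2, 0.0) for sim, other in neighbours) / total

# Toy corpus (hypothetical): "devour peach" and "devour beach" never occur.
corpus = [
    ("eat", "a"), ("eat", "peach"), ("eat", "apple"),
    ("devour", "a"), ("devour", "apple"),
    ("visit", "a"), ("visit", "beach"),
]
probs = conditional_probs(corpus)
peach = similarity_estimate("devour", "peach", probs)
beach = similarity_estimate("devour", "beach", probs)
print(peach > beach)
```

In this toy setting "devour" is distributionally closer to "eat" than to "visit", so the unseen bigram "devour peach" receives a higher estimate than "devour beach". In a back-off setting, an estimate of this kind would presumably replace the lower-order estimate used for unseen bigrams.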

PUBLICATION RECORD

Publication year
1994
Venue
Annual Meeting of the Association for Computational Linguistics
Publication date
1994-05-02
Fields of study
Computer Science
Identifiers
DOI 10.3115/981732.981770 arXiv cmp-lg/9405001
Source metadata
Semantic Scholar

REFERENCES

Smoothing of Automatically Generated Selectional Constraints
1993cited by this paper
Towards History-based Grammars: Using Richer Models for Probabilistic Parsing
1993cited by this paper
Contextual Word Similarity and Estimation From Sparse Data
1993cited by this paper
Improvements in Stochastic Language Modeling
1992cited by this paper
Class-Based n-gram Models of Natural Language
1992cited by this paper
Stochastic Lexicalized Tree-adjoining Grammars
1992cited by this paper
Grammatical Trigrams: A Probabilistic Model of Link Grammar
1992cited by this paper
Cooccurrence smoothing for stochastic language modeling
1992cited by this paper
Experience with a Stack Decoder-Based HMM CSR and Back-Off N-Gram Language Models
1991cited by this paper
A comparison of the enhanced Good-Turing and deleted estimation methods for estimating probabilities of English bigrams
1991cited by this paper
Estimation of probabilities from sparse data for the language model component of a speech recognizer
1987cited by this paper
Isolated word recognition using hidden Markov models
1985cited by this paper
THE POPULATION FREQUENCIES OF SPECIES AND THE ESTIMATION OF POPULATION PARAMETERS
1953cited by this paper

CITED BY

Inference Helps PLMs’ Conceptual Understanding: Improving the Abstract Inference Ability with Hierarchical Conceptual Entailment Graphs
2024cites this paper
GPU-based Private Information Retrieval for On-Device Machine Learning Inference
2023cites this paper
From unified phrase representation to bilingual phrase alignment in an unsupervised manner
2022cites this paper
Estimating word co-occurrence probabilities from pretrained static embeddings using a log-bilinear model
2022cites this paper
Consensus Knowledge Graph Learning via Multi-view Sparse Low Rank Block Model
2022cites this paper
Fast Extraction of Word Embedding from Q-contexts
2021cites this paper
Graph Fusion Network for Text Classification
2021cites this paper
Network representation learning: A macro and micro view
2021cites this paper
From static to dynamic word representations: a survey
2020cites this paper
Context-theoretic Semantics for Natural Language: an Algebraic Framework
2020cites this paper
Four Problems with Extracting Human Semantics from Large Text Corpora
2019cites this paper
Integrating learned and explicit document features for reputation monitoring in social media
2019cites this paper
Neural Network Methods for Natural Language Processing
2017cites this paper
Improving Semantic Composition with Offset Inference
2017cites this paper
Improving Sparse Word Representations with Distributional Inference for Semantic Composition
2016cites this paper
ProMine: A Text Mining Solution for Concept Extraction and Filtering
2016cites this paper
Aligning Packed Dependency Trees: A Theory of Composition for Distributional Semantics
2016cites this paper
Word representation using a deep neural network
2016cites this paper
The mechanism of additive composition
2015cites this paper
Discovering missing me edges across social networks
2015cites this paper
Probabilistic topic modeling in multilingual settings: An overview of its methodology and applications
2015cites this paper
Unsupervised Algorithms for Cross-Lingual Text Analysis, Translation Mining, and Information Retrieval (Algoritmen voor ongesuperviseerde cross-linguale tekstanalyse, het identificeren van vertalingen en informatieontsluiting)
2014cites this paper
Unsupervised Algorithms for Cross-Lingual Text Analysis, Translation Mining, and Information Retrieval
2014cites this paper
Linguistic Regularities in Sparse and Explicit Word Representations
2014cites this paper
A generic framework and methodology for extracting semantics from co-occurrences
2014cites this paper
Neural Word Embedding as Implicit Matrix Factorization
2014cites this paper
Efficient large-context dependency parsing and correction with distributional lexical resources. (Analyse syntaxique probabiliste en dépendances : approches efficaces à large contexte avec ressources lexicales distributionnelles)
2013cites this paper
Selective impairment of adjective order constraints as overeager abstraction: An elaboration on Kemmerer et al. (2009)
2013cites this paper
Asymmetric Distributional Similarity Measures to Recognize Textual Entailment by Generality. (Mesures de similarité distributionnelle asymétrique pour la détection de l'implication textuelle par généralité)
2013cites this paper
Numerical Algorithms for the Analysis of Expert Opinions Elicited in Text Format
2013cites this paper
Selective Impairment of Adjective Order Constraints as Overeager Abstraction: an Elaboration on Kemmerer Et Al. (2009)
2012cites this paper
Distributional Measures of Semantic Distance: A Survey
2012cites this paper
Discovering Links among Social Networks
2012cites this paper
Distributional Measures as Proxies for Semantic Relatedness
2012influential citation
Opinion mining: reviewed from word to document level
2012cites this paper
Continuous space models with neural networks in natural language processing. (Modèles neuronaux pour la modélisation statistique de la langue)
2012cites this paper
Discovering Hidden me Edges in a Social Internetworking Scenario
2012cites this paper
Term Validation for Vocabulary Construction and Key Term Extraction
2011cites this paper
Using Various Features in Machine Learning to Obtain High Levels of Performance for Recognition of Japanese Notational Variants
2010cites this paper
Query reformulation using anchor text
2010cites this paper
Distributional Clustering of Words for Text Classification
2010cites this paper
A Bayesian Method for Robust Estimation of Distributional Similarities
2010cites this paper
A Probabilistic Model of Semantic Plausibility in Sentence Processing
2009cites this paper
A survey on sentiment detection of reviews
2009cites this paper
Vector-based Ranking Techniques for Identifying the Topical Anchors of a Context
2009cites this paper
Context-based Quasi-Synonym Extraction
2009influential citation
Unsupervised Type and Token Identification of Idiomatic Expressions
2009cites this paper
Automatic Summary Evaluation without Human Models
2008cites this paper
Studying the History of Ideas Using Topic Models
2008cites this paper
Acquisition automatique de sens pour la désambiguïsation et la sélection lexicale en traduction
2008cites this paper
The values of meaning and the meanings of 'values': environmental language in text and concept system in a Wet Tropics World Heritage context
2008cites this paper
On the use of natural language processing for automated conceptual data modeling
2008cites this paper
Automatic acquisition for sensibility knowledge using co-occurrence relation
2008cites this paper
Exploiting Keyword Co-occurrence and Citations for Query Generation
2008cites this paper
Measuring Semantic Distance using Distributional Profiles of Concepts
2008cites this paper
Estimating Sparse Events using Probabilistic Logic: Application to Word n-Grams
2007cites this paper
AUTOMATIC ACQUISITION OF LEXICAL KNOWLEDGE ABOUT
2007cites this paper
Automatic thesaurus extraction for Icelandic
2007cites this paper
Automatic acquisition of lexical knowledge about multiword predicates
2007cites this paper
Word Sense Disambiguation: The State of the Art
2007cites this paper
Conceptual Clustering of Korean Concordances Using Similarity between
2007cites this paper
Clustering Words by Syntactical Behavior
2007influential citation
Semantic Similarity Measure of Polish Nouns Based on Linguistic Features
2007cites this paper
Automatically Constructing a Lexicon of Verb Phrase Idiomatic Combinations
2006cites this paper
WebSim : A Novel Term Similarity Metric based on a Web Search Technology
2006cites this paper
WebSim : A Pathway to Unveiling Term Relationships using a Web Search Technology
2006cites this paper
A Web-Based Novel Term Similarity Framework for Ontology Learning
2006cites this paper
A Clustering Approach for Nearly Unsupervised Recognition of Nonliteral Language
2006cites this paper
Probabilistic logic with minimum perplexity: Application to language modeling
2005cites this paper
Name disambiguation in author citations using a K-way spectral clustering method
2005cites this paper
Word Sense Disambiguation: The State of the Art
2005cites this paper
Using distributional similarity to organise biomedical terminology
2005cites this paper
Ontology Learning from Text: A Survey of Methods
2005cites this paper
From distributional to semantic similarity
2004cites this paper
UBB system at Senseval-3
2004cites this paper
Learning Subjective Language
2004cites this paper
Data Driven Approaches to Speech and Language Processing
2004cites this paper
Self-organizing semantic maps and its application to word alignment in Japanese-Chinese parallel corpora
2004cites this paper
Two supervised learning approaches for name disambiguation in author citations
2004cites this paper
Using WordNet Lexical Database and Internet to Disambiguate Word Senses
2003cites this paper
Task adaptation in stochastic language model for Chinese homophone disambiguation
2003cites this paper
Towards efficient statistical parsing using lexicalized grammatical information
2002cites this paper
Self-Organizing Chinese and Japanese Semantic Maps
2002cites this paper
AN APPROACH TO DOCUMENT ENGINEERING AT FUNDACIÓN
2002cites this paper
Natural language processing with neural networks
2002cites this paper
Ontology enrichment with texts from the WWW
2002cites this paper
Text segmentation using a cache memory
2002cites this paper
An HMM Approach to Vowel Restoration in Arabic and Hebrew
2002cites this paper
AN APPROACH TO DOCUMENT ENGINEERING AT FUNDACIÓN HULLERA VASCO-LEONESA
2002cites this paper
NRRC Summer Workshop on Multiple-Perspective Question Answering Final Report
2002cites this paper
Word Alignment in English-Chinese Parallel Corpora
2002influential citation
Neural Network Approach to Adaptive Learning with an Application to Chinese Homophone Disambiguation
2002cites this paper
Word clustering and disambiguation based on co-occurrence data
2002cites this paper
Machine Learning for Information Extraction
2001cites this paper
An Adaptive Algorithm for Learning Changes in Run-Time Context Domain
2001cites this paper
Aide à la conception de méthodes de classification pour la construction d'ontologies : l'atelier Mo'K
2001influential citation
Smoothing a probabilistic lexicon via syntactic transformations
2001cites this paper
Can bilingual word alignment improve monolingual phrasal term extraction
2001cites this paper
Trucks: a model for automatic multiword term recognition
2001cites this paper