Predicting the relevance of distributional semantic similarity with contextual information

Philippe Muller, C. Fabre, Clémentine Adam

Published 2014 in Annual Meeting of the Association for Computational Linguistics

ABSTRACT

Using distributional analysis methods to compute semantic proximity links between words has become commonplace in NLP. The resulting relations are often noisy or difficult to interpret in general. This paper focuses on the issues of evaluating a distributional resource and filtering the relations it contains, but instead of considering it in abstracto, we focus on pairs of words in context. In a discourse, we are interested in knowing if the semantic link between two items is a byproduct of textual coherence or is irrelevant. We first set up a human annotation of semantic links with or without contextual information to show the importance of the textual context in evaluating the relevance of semantic similarity, and to assess the prevalence of actual semantic relations between word tokens. We then built an experiment to automatically predict this relevance, evaluated on the reliable reference data set that was the outcome of the first annotation. We show that in-document information greatly improves the prediction made by the similarity level alone.

PUBLICATION RECORD

Publication year
2014
Venue
Annual Meeting of the Association for Computational Linguistics
Publication date
2014-06-23
Fields of study
Linguistics, Computer Science
Identifiers
DOI 10.3115/v1/P14-1045
Source metadata
Semantic Scholar

REFERENCES

Identifying Bad Semantic Neighbors for Improving Distributional Thesauri (2013)
*SEM 2013 shared task: Semantic Textual Similarity (2013)
Enhancing lexical cohesion measure with confidence measures, semantic relations and language model interpolation for multimedia spoken content topic segmentation (2012)
Using Distributional Similarity for Lexical Expansion in Knowledge-based Word Sense Disambiguation (2012)
FreDist : Construction automatique d’un thésaurus distributionnel pour le Français (FreDist : Automatic construction of distributional thesauri for French) (2011)
How we BLESSed distributional semantic evaluation (2011)
Distributional Memory: A General Framework for Corpus-Based Semantics (2010)
A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches (2009)
Library 2.0: A guide to participatory library service (2008)
Do we Still Need Gold Standards for Evaluation? (2008)
Automatic lexico-semantic acquisition for question answering (2008)
A Uniform Approach to Analogies, Synonyms, Antonyms, and Associations (2008)
A Comparison of Co-occurrence and Similarity Measures as Simulations of Context (2008)
Dependency-Based Construction of Semantic Space Models (2007)
Topic Segmentation Algorithms for Text Summarization and Passage Retrieval: An Exhaustive Evaluation (2007)
Corpus-based and Knowledge-based Measures of Text Semantic Similarity (2006)
Towards pertinent evaluation methodologies for word-space models (2006)
Co-occurrence Retrieval: A Flexible Framework for Lexical Distributional Similarity (2005)
New Experiments in Distributional Representations of Synonymy (2005)
From distributional to semantic similarity (2004)
Non-Classical Lexical Semantic Relations (2004)
Measures and applications of lexical distributional similarity (2003)
Placing search in context: the concept revisited (2002)
UPERY : un outil d’analyse distributionnelle étendue pour la construction d’ontologies à partir de corpus (2002)
SMOTE: Synthetic Minority Over-sampling Technique (2002)
Improvements in Automatic Thesaurus Extraction (2002)
Random Forests (2001)
MetaCost: a general method for making classifiers cost-sensitive (1999)
An Information-Theoretic Definition of Similarity (1998)
Towards Better NLP System Evaluation (1994)
Explorations in automatic thesaurus discovery (1994)
One Sense Per Discourse (1992)
Word Association Norms, Mutual Information, and Lexicography (1989)
A theory of term importance in automatic text analysis (1974)

CITED BY

Using Textual Pre-Processing and Text Mining to Create Semantic Links (2019)
Neural Metaphor Detecting with CNN-LSTM Model (2018)
Supervised Word-Level Metaphor Detection: Experiments with Concreteness and Reweighting of Examples (2015)
Distributional Semantics Today - Introduction to the special issue (2015)