Sparse Bilingual Word Representations for Cross-lingual Lexical Entailment

Published 2016 in North American Chapter of the Association for Computational Linguistics

ABSTRACT

We introduce the task of cross-lingual lexical entailment, which aims to detect whether the meaning of a word in one language can be inferred from the meaning of a word in another language. We construct a gold standard for this task, and propose an unsupervised solution based on distributional word representations. As is commonly done in the monolingual setting, we assume that a word e entails a word f if the prominent context features of e are a subset of those of f. To address the challenge of comparing contexts across languages, we propose a novel method for inducing sparse bilingual word representations from monolingual and parallel texts. Our approach yields an F-score of 70%, and significantly outperforms strong baselines based on translation and on existing word representations.
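
The inclusion assumption stated in the abstract can be illustrated with a small sketch. It assumes that both words are already represented as sparse, non-negative vectors in one shared bilingual feature space, and it uses a simple weighted-inclusion ratio chosen for illustration only; it is not necessarily the scoring function used in the paper. The function name, the toy vectors, and the dimension indices are all hypothetical.

```python
from typing import Dict

# A sparse vector maps a feature (dimension) index to a non-negative weight.
SparseVec = Dict[int, float]


def inclusion_score(e: SparseVec, f: SparseVec) -> float:
    """Share of e's total feature weight that lies on dimensions active in f.

    A score near 1.0 means the features of e are (almost) a subset of the
    features of f, which under the inclusion assumption suggests e entails f.
    """
    total = sum(w for w in e.values() if w > 0)
    if total == 0:
        return 0.0  # no active features: no evidence of entailment
    shared = sum(w for d, w in e.items() if w > 0 and f.get(d, 0.0) > 0)
    return shared / total


# Toy vectors (hypothetical values) in a shared French-English feature space.
chien = {0: 0.5, 1: 0.25}                      # French "chien" (dog)
animal = {0: 0.5, 1: 0.5, 2: 0.25, 3: 0.25}    # English "animal"

forward = inclusion_score(chien, animal)   # features of "chien" inside "animal"
backward = inclusion_score(animal, chien)  # features of "animal" inside "chien"
print(forward, backward)
```

The score is directional: in this toy example every active feature of "chien" is also active for "animal", while the reverse does not hold, which matches the intuition that the narrower term entails the broader one and not the other way around.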

PUBLICATION RECORD

Publication year: 2016
Venue: North American Chapter of the Association for Computational Linguistics
Publication date: 2016-06-01
Fields of study: Linguistics, Computer Science
DOI: 10.18653/v1/N16-1142
