Sparse Overcomplete Word Vector Representations
Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, Noah A. Smith
Published in 2015 at the Annual Meeting of the Association for Computational Linguistics (ACL)
ABSTRACT
Current distributed representations of words show little resemblance to theories of lexical semantics. The former are dense and uninterpretable; the latter are largely based on familiar, discrete classes (e.g., supersenses) and relations (e.g., synonymy and hypernymy). We propose methods that transform word vectors into sparse (and optionally binary) vectors. The resulting representations are more similar to the interpretable features typically used in NLP, though they are discovered automatically from raw corpora. Because the vectors are highly sparse, they are computationally easy to work with. Most importantly, we find that they outperform the original vectors on benchmark tasks.
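The core technique named in the abstract is sparse coding: dense word vectors are re-expressed as sparse codes over an overcomplete dictionary, with an optional binarization step. The sketch below illustrates that idea using scikit-learn's dictionary learner as a stand-in optimizer; the matrix sizes, the hyperparameters (K, alpha), and the random input vectors are illustrative assumptions, not the authors' settings or implementation.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Stand-in for pretrained dense word vectors (e.g., GloVe): V words, d dims.
# Random data here; in practice, load real embeddings.
rng = np.random.default_rng(0)
V, d = 500, 50
X = rng.standard_normal((V, d))

# Overcomplete dictionary: K > d atoms, with an L1 penalty on the codes,
# approximating an objective of the form
#     min_{D,A}  ||X - A D||_2^2  +  lambda * ||A||_1
# (the paper uses its own online optimizer; scikit-learn is a convenient proxy).
K = 10 * d  # 10x overcompleteness, an illustrative choice
learner = DictionaryLearning(
    n_components=K,
    alpha=1.0,                        # L1 weight on the codes
    max_iter=30,
    transform_algorithm="lasso_lars", # L1-penalized encoding at transform time
    transform_alpha=1.0,
    random_state=0,
)
A = learner.fit_transform(X)  # sparse codes, shape (V, K)

# Optional binarization: record only which dimensions fire (positive entries),
# a simple stand-in for the paper's binary variant.
B = (A > 0).astype(np.int8)

print("mean nonzeros per word:", float((A != 0).sum(axis=1).mean()))
```

Because each row of A (or B) has few active dimensions, the codes are cheap to store and can be used directly as interpretable, feature-like word representations in downstream NLP models.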
PUBLICATION RECORD
- Publication year: 2015
- Venue: Annual Meeting of the Association for Computational Linguistics (ACL)
- Publication date: 2015-06-05
- Fields of study: Mathematics, Computer Science
- Source metadata: Semantic Scholar