Fast Learning of Clusters and Topics via Sparse Posteriors

Published 2016 in arXiv.org

ABSTRACT

Mixture models and topic models generate each observation from a single cluster, but standard variational posteriors for each observation assign positive probability to all possible clusters. This requires dense storage and runtime costs that scale with the total number of clusters, even though typically only a few clusters have significant posterior mass for any data point. We propose a constrained family of sparse variational distributions that allow at most $L$ non-zero entries, where the tunable threshold $L$ trades off speed for accuracy. Previous sparse approximations have used hard assignments ($L=1$), but we find that moderate values of $L>1$ provide superior performance. Our approach easily integrates with stochastic or incremental optimization algorithms to scale to millions of examples. Experiments training mixture models of image patches and topic models for news articles show that our approach produces better-quality models in far less time than baseline methods.
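
The abstract does not spell out how the sparse posterior for one observation is computed. A minimal sketch of one natural reading, assuming the $L$ highest-weight clusters are kept and renormalized and every other cluster is set to exactly zero, is given below. The function name `sparse_responsibilities` and the input `log_weights` are hypothetical names introduced for illustration and are not taken from the paper.

```python
import heapq
import math


def sparse_responsibilities(log_weights, L):
    """Return {cluster_index: probability} with at most L non-zero entries.

    log_weights: unnormalized log posterior weights for one observation,
                 one entry per cluster (hypothetical input for this sketch).
    L:           maximum number of clusters allowed to carry non-zero mass.
    """
    K = len(log_weights)
    if not 1 <= L <= K:
        raise ValueError("L must satisfy 1 <= L <= number of clusters")
    # Assumption: keep the L clusters with the largest log weight.
    top = heapq.nlargest(L, range(K), key=lambda k: log_weights[k])
    # Renormalize over the kept clusters only; subtract the max for stability.
    m = max(log_weights[k] for k in top)
    unnorm = {k: math.exp(log_weights[k] - m) for k in top}
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}


# Usage: five clusters, keep at most L = 2 of them.
log_w = [math.log(p) for p in (0.50, 0.30, 0.15, 0.04, 0.01)]
resp = sparse_responsibilities(log_w, L=2)
```

Under this sketch only $L$ index/value pairs are stored per observation instead of one value per cluster, which is the storage saving the abstract describes. Setting `L=1` reduces to a hard assignment, and setting `L` to the number of clusters recovers the usual dense posterior.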

PUBLICATION RECORD

Publication year
2016
Venue
arXiv.org
Publication date
2016-09-23
Fields of study
Mathematics, Computer Science
Identifiers
arXiv 1609.07521

CITED BY

Large-scale entity resolution via microclustering Ewens–Pitman random partitions
2025, influential citation
Fitting large mixture models using stochastic component selection
2021, cites this paper
Direct Evolutionary Optimization of Variational Autoencoders With Binary Latents
2020, cites this paper
Evolutionary Variational Optimization of Generative Models
2020, cites this paper
Large Scale Clustering with Variational EM for Gaussian Mixture Models
2019, influential citation
Accelerated Training of Large-Scale Gaussian Mixtures by a Merger of Sublinear Approaches
2018, influential citation
A Variational EM Acceleration for Efficient Clustering at Very Large Scales
2018, cites this paper
Can clustering scale sublinearly with its clusters? A variational EM acceleration of GMMs and k-means
2017, cites this paper
Truncated Variational Sampling for 'Black Box' Optimization of Generative Models
2017, cites this paper
Truncated variational EM for semi-supervised neural simpletrons
2017, cites this paper
k-means as a variational EM approximation of Gaussian mixture models
2017, cites this paper
Truncated Variational Expectation Maximization
2016, influential citation
Neural Simpletrons: Learning in the Limit of Few Labels with Directed Generative Networks
2015, cites this paper