Distributed Bayesian matrix factorization with limited communication

Xiangju Qin, P. Blomstedt, Eemeli Leppäaho, P. Parviainen, Samuel Kaski

Published 2017 in Machine-mediated learning

ABSTRACT

Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank representations of matrices, for predicting missing values, and for providing confidence intervals. Scaling up the posterior inference for massive-scale matrices is challenging and requires distributing both data and computation over many workers, making communication the main computational bottleneck. Embarrassingly parallel inference would remove the need for communication by using completely independent computations on different data subsets, but it suffers from the inherent unidentifiability of BMF solutions. We introduce a hierarchical decomposition of the joint posterior distribution, which couples the subset inferences, allowing for embarrassingly parallel computations in a sequence of at most three stages. Using an efficient approximate implementation, we show improvements empirically on both real and simulated data. Our distributed approach is able to achieve a speed-up of almost an order of magnitude over the full posterior, with a negligible effect on predictive accuracy. Our method outperforms state-of-the-art embarrassingly parallel MCMC methods in accuracy, and achieves results competitive with other available distributed and parallel implementations of BMF.
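
The unidentifiability mentioned in the abstract can be illustrated with a small sketch. It is not the authors' method or code; it only demonstrates the well-known fact that a factorization X = U V^T is unchanged when both factors are multiplied by the same orthogonal matrix, so factors inferred independently on different data subsets need not be aligned. The matrices `U`, `V`, the rotation angle, and all helper names are hypothetical values chosen for illustration.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

def max_abs_diff(A, B):
    """Largest absolute entrywise difference between two equally sized matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def rotation(theta):
    """2 x 2 rotation matrix, which is orthogonal: R R^T = I."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical rank-2 factors of a 3 x 3 matrix X = U V^T.
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[2.0, 1.0], [0.0, 3.0], [1.0, -1.0]]
X = matmul(U, transpose(V))

# Rotating both factors by the same orthogonal R leaves the product unchanged:
# (U R)(V R)^T = U R R^T V^T = U V^T.
R = rotation(math.pi / 3)
U_rot, V_rot = matmul(U, R), matmul(V, R)
X_rot = matmul(U_rot, transpose(V_rot))

# Naively averaging the two equally valid factorizations does not reproduce X.
U_avg = [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(U, U_rot)]
V_avg = [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(V, V_rot)]
X_avg = matmul(U_avg, transpose(V_avg))

product_gap = max_abs_diff(X, X_rot)    # numerically zero: same matrix
factor_gap = max_abs_diff(U, U_rot)     # clearly nonzero: different factors
averaging_gap = max_abs_diff(X, X_avg)  # clearly nonzero: averaging distorts X

print(product_gap, factor_gap, averaging_gap)
```

In this toy case the averaged factors reproduce a shrunken copy of X rather than X itself, which is one way to see why combining fully independent subset inferences is problematic and why some coupling between them, such as the hierarchical decomposition the abstract describes, is needed.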

PUBLICATION RECORD

Publication year
2017
Venue
Machine-mediated learning
Publication date
2017-03-02
Fields of study
Mathematics, Computer Science
Identifiers
DOI: 10.1007/s10994-019-05778-2; arXiv: 1703.00734
Source metadata
Semantic Scholar

REFERENCES

Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes
2018, cited by this paper
Parallelized Stochastic Gradient Markov Chain Monte Carlo algorithms for non-negative matrix factorization
2017, cited by this paper
Patterns of Scalable Bayesian Inference
2016, cited by this paper
The MovieLens Datasets: History and Context
2016, influential reference
Distributed Bayesian Probabilistic Matrix Factorization
2016, cited by this paper
Exploring Parallel Implementations of the Bayesian Probabilistic Matrix Factorization
2016, cited by this paper
Lambda means clustering: Automatic parameter search and distributed computing implementation
2016, cited by this paper
Merging MCMC Subposteriors through Gaussian-Process Approximations
2016, cited by this paper
Asynchronous Parallel Stochastic Gradient for Nonconvex Optimization
2015, cited by this paper
Parallelizing MCMC with Random Partition Trees
2015, cited by this paper
Scalable Bayes via Barycenter in Wasserstein Space
2015, cited by this paper
Macau: Scalable Bayesian Multi-relational Factorization with Side Information using MCMC
2015, cited by this paper
Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC
2015, cited by this paper
Expectation propagation as a way of life
2014, cited by this paper
Robust and Scalable Bayes via a Median of Subset Posterior Measures
2014, cited by this paper
Probabilistic Matrix Factorization with Non-random Missing Data
2014, cited by this paper
Distributed Bayesian Posterior Sampling via Moment Sharing
2014, cited by this paper
Median Selection Subset Aggregation for Parallel Inference
2014, cited by this paper
Parallelizing MCMC via Weierstrass Sampler
2013, cited by this paper
The ChEMBL bioactivity database: an update
2013, cited by this paper
Asymptotically Exact, Embarrassingly Parallel MCMC
2013, cited by this paper
Predicting Drug–Target Interactions Using Probabilistic Matrix Factorization
2013, cited by this paper
Sparse Bayesian infinite factor models
2011, cited by this paper
Incorporating Side Information in Probabilistic Matrix Factorization with Gaussian Processes
2010, cited by this paper
Bayesian Matrix Factorization with Side Information and Dirichlet Process Mixtures
2010, cited by this paper
Matrix Factorization Techniques for Recommender Systems
2009, cited by this paper
The BellKor Solution to the Netflix Grand Prize
2009, cited by this paper
Bayesian probabilistic matrix factorization using Markov chain Monte Carlo
2008, influential reference
Probabilistic Matrix Factorization
2007, influential reference
Bayesian Model Assessment in Factor Analysis
2004, cited by this paper
Hierarchical Bayesian Matrix Factorization with Side Information (Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence)
year unknown, cited by this paper

CITED BY

An introduction to collaborative filtering through the lens of the Netflix Prize
2025, cites this paper
Matrix Factorization Techniques in Machine Learning, Signal Processing, and Statistics
2023, cites this paper
Distributed Bayesian Matrix Decomposition for Big Data Mining and Clustering
2020, cites this paper
A High-Performance Implementation of Bayesian Matrix Factorization with Limited Communication
2020, cites this paper
Scalable Bayesian Non-linear Matrix Completion
2019, cites this paper