Exponential Family Embeddings

Maja R. Rudolph, Francisco J. R. Ruiz, Stephan Mandt, D. Blei

Published 2016 in Neural Information Processing Systems

ABSTRACT

Word embeddings are a powerful approach for capturing semantic similarity among terms in a vocabulary. In this paper, we develop exponential family embeddings, a class of methods that extends the idea of word embeddings to other types of high-dimensional data. As examples, we studied neural data with real-valued observations, count data from a market basket analysis, and ratings data from a movie recommendation system. The main idea is to model each observation conditioned on a set of other observations. This set is called the context, and the way the context is defined is a modeling choice that depends on the problem. In language the context is the surrounding words; in neuroscience the context is close-by neurons; in market basket data the context is other items in the shopping cart. Each type of embedding model defines the context, the exponential family of conditional distributions, and how the latent embedding vectors are shared across data. We infer the embeddings with a scalable algorithm based on stochastic gradient descent. On all three applications—neural activity of zebrafish, users' shopping behavior, and movie ratings—we found exponential family embedding models to be more effective than other types of dimension reduction. They better reconstruct held-out data and find interesting qualitative structure.
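
The core idea in the abstract (each observation modeled conditionally on its context, with embedding vectors shared across data and fit by stochastic gradient steps) can be illustrated with a minimal sketch. It assumes a Gaussian conditional with identity link and fixed variance, in the spirit of the real-valued neural data example. The names `rho` (embedding vectors), `alpha` (context vectors), `contexts`, and all function names are illustrative choices not given in the abstract, and regularization is omitted. This is not the authors' implementation.

```python
import numpy as np

def context_vector(x_t, alpha, ctx):
    """Sum of context vectors alpha[j], each weighted by its observed value x_t[j]."""
    return x_t[ctx] @ alpha[ctx]

def log_conditional(x_t, rho, alpha, ctx, n, sigma=1.0):
    """Gaussian log-density of x_t[n] given its context, up to an additive constant."""
    mean = rho[n] @ context_vector(x_t, alpha, ctx)
    return -0.5 * ((x_t[n] - mean) / sigma) ** 2

def gradients(x_t, rho, alpha, ctx, n, sigma=1.0):
    """Gradients of log_conditional with respect to rho[n] and alpha[ctx]."""
    s = context_vector(x_t, alpha, ctx)
    resid = (x_t[n] - rho[n] @ s) / sigma ** 2
    grad_rho = resid * s                              # shape (K,)
    grad_alpha = resid * np.outer(x_t[ctx], rho[n])   # shape (len(ctx), K)
    return grad_rho, grad_alpha

def sgd_epoch(x, rho, alpha, contexts, lr=0.01, sigma=1.0, rng=None):
    """One pass of stochastic gradient ascent over all (time, unit) observations.

    x has shape (T, N); rho and alpha have shape (N, K) and are shared across
    all T time points. contexts[n] is an integer array of indices that form
    the context of unit n (assumed unique and not containing n).
    """
    rng = np.random.default_rng() if rng is None else rng
    T, N = x.shape
    for t in rng.permutation(T):
        for n in rng.permutation(N):
            ctx = contexts[n]
            g_rho, g_alpha = gradients(x[t], rho, alpha, ctx, n, sigma)
            rho[n] += lr * g_rho
            alpha[ctx] += lr * g_alpha

if __name__ == "__main__":
    # Synthetic data only; each unit's context is its two ring neighbours (an assumption).
    rng = np.random.default_rng(0)
    T, N, K = 200, 6, 2
    x = rng.normal(size=(T, N))
    contexts = [np.array([(n - 1) % N, (n + 1) % N]) for n in range(N)]
    rho = 0.1 * rng.normal(size=(N, K))
    alpha = 0.1 * rng.normal(size=(N, K))
    for _ in range(5):
        sgd_epoch(x, rho, alpha, contexts, lr=0.01, rng=rng)
```

Swapping the Gaussian for another exponential family (for example a Poisson conditional for count data) would change only `log_conditional` and the residual term in `gradients`; the context definition and parameter sharing would stay the same.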

PUBLICATION RECORD

Publication year
2016
Venue
Neural Information Processing Systems
Publication date
2016-08-02
Fields of study
Mathematics, Computer Science
Identifiers
DOI: 10.7916/D8NZ9RHT; arXiv: 1608.00778
Source metadata
Semantic Scholar

CITED BY

Posterior Sampling of Probabilistic Word Embeddings
2025cites this paper
Sense-specific Historical Word Usage Generation
2025cites this paper
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
2024cites this paper
Bypassing Skip-Gram Negative Sampling: Dimension Regularization as a More Efficient Alternative for Graph Embeddings
2024cites this paper
Probabilistic Machine Learning: New Frontiers for Modeling Consumers and their Choices
2024cites this paper
On the Origins of Linear Representations in Large Language Models
2024cites this paper
BEdeepon: an in silico tool for prediction of base editor efficiencies and outcomes
2024cites this paper
TPPMI - a Temporal Positive Pointwise Mutual Information Embedding of Words
2024cites this paper
WIKITIDE: A Wikipedia-Based Timestamped Definition Pairs Dataset
2023cites this paper
Learning Rich Rankings
2023cites this paper
BRANEnet: embedding multilayer networks for omics data integration
2022cites this paper
Nonparametric exponential family graph embeddings for multiple representation learning
2022cites this paper
Domain-Specific Word Embeddings with Structure Prediction
2022cites this paper
Beer2Vec : Extracting Flavors from Reviews for Thirst-Quenching Recommandations
2022cites this paper
Probabilistic Embeddings with Laplacian Graph Priors
2022cites this paper
VocEmb4SVS: Improving Singing Voice Separation with Vocal Embeddings
2022cites this paper
Gaussian Copula Embeddings
2022influential citation
BEdeepoff: an in silico tool for off-target prediction of ABE and CBE base editors
2021cites this paper
Multiomics Data Integration for Gene Regulatory Network Inference with Exponential Family Embeddings
2021cites this paper
Pseudo-Euclidean Attract-Repel Embeddings for Undirected Graphs
2021influential citation
Representation Learning for Predicting Customer Orders
2021cites this paper
SHORING: Design Provable Conditional High-Order Interaction Network via Symbolic Testing
2021cites this paper
Dynamic Language Models for Continuously Evolving Content
2021cites this paper
Learning Product Characteristics and Consumer Preferences from Search Data
2021cites this paper
Measuring diachronic sense change: New models and Monte Carlo methods for Bayesian inference
2021cites this paper
Privacy-Aware Personalized Entity Representations for Improved User Understanding
2020cites this paper
Unsupervised speech representation learning
2020cites this paper
Lightweight Data Fusion with Conjugate Mappings
2020cites this paper
A Comparative Study of Approaches for the Diachronic Analysis of the Italian Language
2020cites this paper
A Study of Inductive Biases for Unsupervised Speech Representation Learning
2020cites this paper
DeepTriage: Automated Transfer Assistance for Incidents in Cloud Services
2020cites this paper
Learning the Compositional Visual Coherence for Complementary Recommendations
2020cites this paper
Corpus-based Comparison of Distributional Models of Language and Knowledge Graphs
2020influential citation
Compass-aligned Distributional Embeddings for Studying Semantic Differences across Corpora
2020cites this paper
Scalable bundling via dense product embeddings
2020cites this paper
Scalable Realistic Recommendation Datasets through Fractal Expansions
2019cites this paper
Context Aware Machine Learning
2019cites this paper
Predicting Consumption Patterns with Repeated and Novel Events
2019cites this paper
A Dynamic Embedding Model of the Media Landscape
2019cites this paper
Identification, Interpretability, and Bayesian Word Embeddings
2019cites this paper
Cost-Sensitive Parallel Learning Framework for Insurance Intelligence Operation
2019cites this paper
Parallax: Visualizing and Understanding the Semantics of Embedding Spaces via Algebraic Formulae
2019cites this paper
Scaling Up Collaborative Filtering Data Sets through Randomized Fractal Expansions
2019cites this paper
Training Temporal Word Embeddings with a Compass
2019influential citation
Representation Learning for Words and Entities
2019cites this paper
Complementary Recommendations: A Brief Survey
2019influential citation
Topic Modeling in Embedding Spaces
2019cites this paper
Apprentissage de plongements de mots dynamiques avec régularisation de la dérive (Learning dynamic word embeddings with drift regularisation)
2019cites this paper
Contextualized Diachronic Word Representations
2019cites this paper
Times Are Changing: Investigating the Pace of Language Change in Diachronic Word Embeddings
2019cites this paper
Interpretable Word Embeddings via Informative Priors
2019cites this paper
Empirical Study of Diachronic Word Embeddings for Scarce Data
2019cites this paper
Representing Words in Vector Space and Beyond
2019cites this paper
A Probabilistic Framework for Learning Domain Specific Hierarchical Word Embeddings
2019influential citation
Learning Node Embeddings with Exponential Family Distributions
2019cites this paper
Improving Universal Sound Separation Using Sound Classification
2019cites this paper
Exponential Family Graph Embeddings
2019cites this paper
Natural alpha embeddings
2019cites this paper
Quo Vadis, Math Information Retrieval
2019cites this paper
Low-dimensional statistical manifold embedding of directed graphs
2019cites this paper
Learning from SQL: Database Agnostic Workload Management
2019cites this paper
role2vec: Role-based Network Embeddings
2019cites this paper
Provable Gaussian Embedding with One Observation
2018cites this paper
Gaussian Word Embedding with a Wasserstein Distance Loss
2018cites this paper
Equation Embeddings
2018influential citation
End-to-End Learning for the Deep Multivariate Probit Model
2018cites this paper
use a Gaussian random walk to capture drift in the underlying language model ; for example
2018influential citation
Learning Role-based Graph Embeddings
2018cites this paper
Medical Concept Embedding with Time-Aware Attention
2018cites this paper
Sequences of Sets
2018cites this paper
Word2net: Deep Representations of Language
2018cites this paper
Query2Vec: An Evaluation of NLP Techniques for Generalized Workload Analytics
2018cites this paper
Context and Embeddings in Language Modelling - an Exploration
2018cites this paper
Lambert Matrix Factorization
2018cites this paper
Learning Effective Embeddings for Machine Generated Emails with Applications to Email Category Prediction
2018influential citation
Improving Word Embeddings by Emphasizing Co-hyponyms
2018cites this paper
A New Urban Functional Regions Minig Method with MPETM
2018cites this paper
Mining heterogeneous enterprise data
2018cites this paper
NPE: Neural Personalized Embedding for Collaborative Filtering
2018cites this paper
Scalable and Efficient Probabilistic Topic Model Inference for Textual Data
2018cites this paper
Query2Vec: NLP Meets Databases for Generalized Workload Analytics
2018cites this paper
A Discrete Choice Model for Subset Selection
2018cites this paper
Design and Analysis of Statistical Learning Algorithms which Control False Discoveries
2018cites this paper
Geographical Feature Extraction for Entities in Location-based Social Networks
2018cites this paper
Dynamic Embeddings for Language Evolution
2018cites this paper
Choosing to Grow a Graph: Modeling Network Formation as Discrete Choice
2018cites this paper
Machine Learning and Knowledge Discovery in Databases
2018influential citation
Continuous Word Embedding Fusion via Spectral Decomposition
2018cites this paper
The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings
2018influential citation
Conditional Word Embedding and Hypothesis Testing via Bayes-by-Backprop
2018cites this paper
Cost-sensitive Hybrid Neural Networks for Heterogeneous and Imbalanced Data
2018cites this paper
Inferring Complementary Products from Baskets and Browsing Sessions
2018cites this paper
Embeddings 2.0: The Lexicon as Memory
2017cites this paper
Representation learning for relational data
2017cites this paper
Structured Embedding Models for Grouped Data
2017influential citation
Dynamic Word Embeddings via Skip-Gram Filtering
2017cites this paper
Dynamic Word Embeddings
2017cites this paper
Decoupling Homophily and Reciprocity with Latent Space Network Models
2017cites this paper
Nationality Classification Using Name Embeddings
2017cites this paper
Deep Probabilistic Programming
2017cites this paper