Bayesian Hierarchical Mixtures of Experts

Published 2002 in Conference on Uncertainty in Artificial Intelligence

ABSTRACT

The Hierarchical Mixture of Experts (HME) is a well-known tree-structured model for regression and classification, based on soft probabilistic splits of the input space. In its original formulation, its parameters are determined by maximum likelihood, which is prone to severe overfitting, including singularities in the likelihood function. Furthermore, the maximum likelihood framework offers no natural metric for optimizing the complexity and structure of the tree. Previous attempts to provide a Bayesian treatment of the HME model have either relied on local Gaussian representations based on the Laplace approximation, or modified the model so that it represents the joint distribution of both input and output variables, which can be wasteful of resources if the goal is prediction. In this paper, we describe a fully Bayesian treatment of the original HME model based on variational inference. By combining 'local' and 'global' variational methods, we obtain a rigorous lower bound on the marginal probability of the data under the model. This bound is optimized during the training phase, and its resulting value can be used for model order selection. We present results using this approach for data sets describing robot arm kinematics.
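
The "soft probabilistic splits" mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a fixed binary tree of depth two, logistic-sigmoid gates, and linear experts, and it only computes a predictive mean for given weights (no maximum likelihood or variational training). The names `leaf_probabilities`, `hme_mean`, `V`, and `W` are hypothetical and chosen for illustration.

```python
import numpy as np


def sigmoid(a):
    """Logistic sigmoid, used here as the soft binary gate (an assumption)."""
    return 1.0 / (1.0 + np.exp(-a))


def leaf_probabilities(X, V):
    """Probability of reaching each of the 4 leaves of a depth-2 binary tree.

    X: (N, D) inputs, with a constant bias column already appended.
    V: (3, D) gate weights for the root, left child, and right child.
    """
    g_root = sigmoid(X @ V[0])   # probability of branching left at the root
    g_left = sigmoid(X @ V[1])   # probability of branching left at the left child
    g_right = sigmoid(X @ V[2])  # probability of branching left at the right child
    return np.stack(
        [
            g_root * g_left,
            g_root * (1.0 - g_left),
            (1.0 - g_root) * g_right,
            (1.0 - g_root) * (1.0 - g_right),
        ],
        axis=1,
    )


def hme_mean(X, V, W):
    """Predictive mean: leaf-probability-weighted sum of linear expert outputs.

    W: (4, D) weights, one row per linear expert (an illustrative choice).
    """
    P = leaf_probabilities(X, V)  # (N, 4), each row sums to 1
    E = X @ W.T                   # (N, 4), output of each expert
    return np.sum(P * E, axis=1)


# Toy inputs: 5 points in 2 dimensions plus a bias column of ones.
rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(5, 2)), np.ones((5, 1))])
V = rng.normal(size=(3, 3))
W = rng.normal(size=(4, 3))

P = leaf_probabilities(X, V)
y = hme_mean(X, V, W)
```

Because every gate outputs a probability rather than a hard decision, each input contributes to all experts with weights that sum to one. With all gate weights set to zero, each leaf receives probability 0.25 and the prediction reduces to the plain average of the expert outputs.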

PUBLICATION RECORD

Publication year: 2002
Venue: Conference on Uncertainty in Artificial Intelligence
Publication date: 2002-08-07
Fields of study: Mathematics, Computer Science
Identifiers: arXiv 1212.2447

REFERENCES

Bayesian Treed Generalized Linear Models (2003; cited by this paper)
Bayesian model search for mixture models based on optimizing variational bounds (2002; cited by this paper)
VIBES: A Variational Inference Engine for Bayesian Networks (2002; cited by this paper)
Bayesian parameter estimation via variational methods (2000; cited by this paper)
An Introduction to Variational Methods for Graphical Models (1999; cited by this paper)
Classification and Regression using Mixtures of Experts (1997; influential reference)
Bayesian Methods for Mixtures of Experts (1995; cited by this paper)
Hierarchical Mixtures of Experts and the EM Algorithm (1993; cited by this paper)
Neural Networks for Pattern Recognition (1993; cited by this paper)

CITED BY

CoCoAFusE: Beyond Mixtures of Experts via Model Fusion (2025; cites this paper)
Improving Routing in Sparse Mixture of Experts with Graph of Tokens (2025; influential citation)
Ensemble Methods (2025; cites this paper)
Basis Transformers for Multi-Task Tabular Regression (2025; cites this paper)
Efficient Mixture-of-Agents Serving via Tree-Structured Routing, Adaptive Pruning, and Dependency-Aware Prefill-Decode Overlap (2025; cites this paper)
Mixture of Experts for Decentralized Generative AI and Reinforcement Learning in Wireless Networks: A Comprehensive Survey (2025; cites this paper)
Gradient-free variational learning with conditional mixture networks (2024; cites this paper)
A Survey of Neural Trees: Co-Evolving Neural Networks and Decision Trees (2024; cites this paper)
Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts (2024; cites this paper)
Multi-Path Routing for Conditional Information Gain Trellis Using Cross-Entropy Search and Reinforcement Learning (2024; cites this paper)
Information Maximizing Curriculum: A Curriculum-Based Approach for Training Mixtures of Experts (2023; cites this paper)
Gaussian Process-Gated Hierarchical Mixtures of Experts (2023; cites this paper)
Joint Probability Trees (2023; cites this paper)
MODELING MISSING AT RANDOM NEUROPSYCHOLOGICAL TEST SCORES USING A MIXTURE OF BINOMIAL PRODUCT EXPERTS. (2023; cites this paper)
Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data (2023; cites this paper)
Bayesian compositional regression with microbiome features via variational inference (2023; cites this paper)
Hierarchical-Hyperplane Kernels for Actively Learning Gaussian Process Models of Nonstationary Systems (2023; cites this paper)
Variational Boosted Soft Trees (2023; cites this paper)
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators (2022; cites this paper)
A Tree Perspective on Stick-Breaking Models in Covariate-Dependent Mixtures (2022; cites this paper)
A Survey of Neural Trees (2022; cites this paper)
Federated Learning With Privacy-Preserving Ensemble Attention Distillation (2022; influential citation)
Dive into Decision Trees and Forests: A Theoretical Demonstration (2021; cites this paper)
Cluster’s Number Free Bayes Prediction of General Framework on Mixture of Regression Models (2021; cites this paper)
Ensemble Attention Distillation for Privacy-Preserving Federated Learning (2021; cites this paper)
Same State, Different Task: Continual Reinforcement Learning without Interference (2021; cites this paper)
Bayesian hierarchical stacking (2021; cites this paper)
The illusion of a hedonic price function: Nonparametric interpretable segmentation for hedonic inference (2021; cites this paper)
Healing Products of Gaussian Processes (2021; cites this paper)
Convolutional Ordinal Regression Forest for Image Ordinal Estimation (2021; cites this paper)
Decision Machines: Interpreting Decision Tree as a Model Combination Method (2021; cites this paper)
Healing Products of Gaussian Process Experts (2020; cites this paper)
Deep Ordinal Regression Forests (2020; cites this paper)
A similarity-based Bayesian mixture-of-experts model (2020; cites this paper)
Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery (2020; cites this paper)
Adapting computer vision models to limitations on input dimensionality and model complexity (2020; cites this paper)
Amortised Variational Inference for Hierarchical Mixture Models (2020; cites this paper)
A flexible probabilistic framework for large-margin mixture of experts (2019; cites this paper)
An Efficient Semi-Supervised Multi-label Classifier Capable of Handling Missing Labels (2019; cites this paper)
A mixture of experts approach to handle ambiguities in parameter identification problems in material modeling (2019; cites this paper)
Model Selection of Bayesian Hierarchical Mixture of Experts based on Variational Inference (2019; influential citation)
Hierarchical Routing Mixture of Experts (2019; influential citation)
MixLasso: Generalized Mixed Regression via Convex Atomic-Norm Regularization (2018; cites this paper)
Estimation and Mapping of Ship Air Wakes using RC Helicopters as a Sensing Platform (2018; cites this paper)
The Holy Trinity: Blending Statistics, Machine Learning and Discrete Choice, with Applications to Strategic Bicycle Planning (2018; cites this paper)
Sparse Bayesian Hierarchical Mixture of Experts and Variational Inference (2018; cites this paper)
Mixtures of Experts Models (2018; cites this paper)
Modèle de mélange et modèles linéaires généralisés, application aux données de co-infection (arbovirus & paludisme) (2018; influential citation)
Sublinear-Time Learning and Inference for High-Dimensional Models (2018; cites this paper)
Classification using hierarchical mixture of discriminative learners: How to achieve high scores with few resources? (2018; cites this paper)
Practical and theoretical aspects of mixture‐of‐experts modeling: An overview (2018; cites this paper)
When Gaussian Process Meets Big Data: A Review of Scalable GPs (2018; cites this paper)
Hierarchy of Alternating Specialists for Scene Recognition (2018; cites this paper)
Conditionally Conjugate Mean-Field Variational Bayes for Logistic Models (2017; cites this paper)
A probabilistic multi-label classifier with missing and noisy labels handling capability (2017; cites this paper)
Generalized Inverse Reinforcement Learning with Linearly Solvable MDP (2017; cites this paper)
Mixtures of Conditional Random Fields for Improved Structured Output Prediction (2017; cites this paper)
Machine Learning Meets Microeconomics: The Case of Decision Trees and Discrete Choice (2017; cites this paper)
Covariate dependent random measures with applications in biostatistics (2017; cites this paper)
Predictive coarse-graining (2016; cites this paper)
Online Learning with Bayesian Classification Trees (2016; cites this paper)
Homework 5 : Approximate Inference Deadline : 22 Khordad 12 : 00 1 : Variational Bayes Method for Univariate Normal Distribution (2016; cites this paper)
Robust mixture of experts modeling using the t distribution (2016; influential citation)
Modèles probabilistes formels pour problèmes cognitifs usuels (2016; cites this paper)
META-CLASSIFICATION FOR VARIABLE STARS (2016; cites this paper)
Extraction of Impact of Wind Turbulence on RC Helicopters Using Machine Learning (2016; cites this paper)
The dynamic random subgraph model for the clustering of evolving networks (2016; cites this paper)
An Efficient Large-scale Semi-supervised Multi-label Classifier Capable of Handling Missing labels (2016; cites this paper)
Skew-normal Mixture of Experts (2016; cites this paper)
Robust mixture of experts modeling using the skew $t$ distribution (2016; cites this paper)
Sparse conditional copula models for structured output regression (2016; cites this paper)
Causal Falling Rule Lists (2015; cites this paper)
Goodness of fit of logistic models for random graphs (2015; cites this paper)
A comparative review of variable selection techniques for covariate dependent Dirichlet process mixture models (2015; cites this paper)
Variable Selection for Covariate Dependent Dirichlet Process Mixtures of Regressions (2015; cites this paper)
Goodness of Fit of Logistic Regression Models for Random Graphs (2015; cites this paper)
Deep Neural Decision Forests (2015; cites this paper)
Discovering Compact and Informative Structures through Data Partitioning (2015; cites this paper)
Non-Normal Mixtures of Experts (2015; influential citation)
Sparse inverse covariance learning of conditional Gaussian mixtures for multiple-output regression (2015; cites this paper)
Interpretable per case weighted ensemble method for cancer associations (2014; cites this paper)
Supervised topic regression via experts (2014; cites this paper)
Bayesian Context-Dependent Learning for Anomaly Classification in Hyperspectral Imagery (2014; cites this paper)
Simultaneous Feature and Expert Selection within Mixture of Experts (2014; cites this paper)
Non-parametric Bayesian mixture of sparse regressions with application towards feature selection for statistical downscaling (2014; cites this paper)
Simultaneous Twin Kernel Learning Using Polynomial Transformations for Structured Prediction (2014; cites this paper)
Towards an Improved Ensemble Learning Model of Artificial Neural Networks: Lessons Learned on Using Randomized Numbers of Hidden Neurons (2014; cites this paper)
Activity recognition with android phone using mixture-of-experts co-trained with labeled and unlabeled data (2014; cites this paper)
Combination of Movement Primitives for Robotics Kombination von Movement Primitives in der Robotik (2014; cites this paper)
Learning phenotype densities conditional on many interacting predictors (2014; influential citation)
Embedded local feature selection within mixture of experts (2014; cites this paper)
Modellschätzvorrichtung, modellschätzverfahren und modellschätzprogramm (2014; cites this paper)
Model Selection in Overlapping Stochastic Block Models (2014; cites this paper)
Provable Tensor Methods for Learning Mixtures of Classifiers (2014; cites this paper)
Hierarchical latent variable model estimation device (2014; cites this paper)
A Bayesian model for identifying hierarchically organised states in neural population activity (2014; cites this paper)
A model-learner pattern for bayesian reasoning (2013; influential citation)
Learning Densities Conditional on Many Interacting Features (2013; cites this paper)
CATEGORY-LEVEL VISUAL OBJECT RECOGNITION USING NOVEL MACHINE LEARNING TECHNIQUES (2013; cites this paper)
Classification of Multi-dimensional Streaming Time Series by Weighting Each Classifier's Track Record (2013; influential citation)