Mixture model averaging for clustering

Published 2012 in Advances in Data Analysis and Classification

ABSTRACT

In mixture model-based clustering applications, it is common to fit several models from a family and report clustering results from only the ‘best’ one. In such circumstances, selection of this best model is achieved using a model selection criterion, most often the Bayesian information criterion. Rather than throw away all but the best model, we average multiple models that are in some sense close to the best one, thereby producing a weighted average of clustering results. Two (weighted) averaging approaches are considered: averaging component membership probabilities and averaging models. In both cases, Occam’s window is used to determine closeness to the best model and weights are computed within a Bayesian model averaging paradigm. In some cases, we need to merge components before averaging; we introduce a method for merging mixture components based on the adjusted Rand index. The effectiveness of our model-based clustering averaging approaches is illustrated using a family of Gaussian mixture models on real and simulated data.
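
The weighting and membership-averaging steps summarized in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes that BIC is defined as -2 log L + p log n (smaller is better), that posterior model probabilities are approximated as proportional to exp(-BIC/2) under equal prior model probabilities, and that Occam's window keeps every model whose approximate posterior probability is at least 1/c of the best model's. The threshold c = 20, the function names, and the BIC values in the usage note are all hypothetical. The sketch also assumes that the models being averaged already have the same number of components with aligned labels, which in practice may require the merging step the abstract mentions.

```python
import numpy as np


def occam_window_weights(bic, c=20.0):
    """Approximate model-averaging weights restricted to Occam's window.

    Assumes BIC = -2 log L + p log n (smaller is better) and equal
    prior model probabilities; c is an illustrative threshold.
    """
    bic = np.asarray(bic, dtype=float)
    # Approximate posterior probability of each model relative to the best one.
    relative = np.exp(-0.5 * (bic - bic.min()))
    # Keep only models within a factor c of the best model.
    keep = relative >= 1.0 / c
    weights = np.where(keep, relative, 0.0)
    return keep, weights / weights.sum()


def average_memberships(z_list, weights):
    """Weighted average of component membership probability matrices.

    Each element of z_list is an (n, G) array; all arrays are assumed to
    share the same G and the same component labelling.
    """
    z = np.stack(z_list)  # shape (M, n, G)
    return np.tensordot(weights, z, axes=1)  # shape (n, G)
```

With hypothetical BIC values such as `[1000.0, 1002.0, 1010.0]`, the third model falls outside the window and receives zero weight, while the first two share the total weight. Passing the resulting weights to `average_memberships` along with one membership matrix per model gives averaged probabilities whose rows still sum to one.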

PUBLICATION RECORD

Publication year: 2012
Venue: Advances in Data Analysis and Classification
Publication date: 2012-12-23
Fields of study: Mathematics, Computer Science
Identifiers: DOI 10.1007/s11634-014-0182-6; arXiv 1212.5760
Source metadata: Semantic Scholar
