Model based clustering for mixed data: clustMD

Published 2015 in Advances in Data Analysis and Classification

ABSTRACT

A model-based clustering procedure for data of mixed type, clustMD, is developed using a latent variable model. It is proposed that a latent variable, following a mixture of Gaussian distributions, generates the observed data of mixed type. The observed data may be any combination of continuous, binary, ordinal or nominal variables. clustMD employs a parsimonious covariance structure for the latent variables, leading to a suite of six clustering models that vary in complexity and provide an elegant and unified approach to clustering mixed data. An expectation-maximisation (EM) algorithm is used to estimate the clustMD model; in the presence of nominal data, a Monte Carlo EM algorithm is required. The clustMD model is illustrated by clustering simulated mixed-type data and data on prostate cancer patients, on whom mixed data have been recorded.
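
The generative idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name sample_mixed, the diagonal covariance, the threshold values and the omission of nominal variables are all assumptions made here for illustration. A latent vector is drawn from a Gaussian mixture; the continuous variable is observed directly, while the binary and ordinal variables are obtained by thresholding their latent coordinates.

```python
import random


def sample_mixed(n, weights, means, sds, thresholds, seed=0):
    """Simulate n rows of (continuous, binary, ordinal) data.

    Hypothetical helper: each row comes from a latent 3-dimensional
    Gaussian mixture with diagonal covariance (an assumption).
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        # Pick a mixture component according to the mixing weights.
        g = rng.choices(range(len(weights)), weights=weights)[0]
        # Draw the latent vector for that component, one coordinate at a time.
        z = [rng.gauss(m, s) for m, s in zip(means[g], sds[g])]
        continuous = z[0]                 # observed as-is
        binary = int(z[1] > 0.0)          # single threshold at zero (assumed)
        # Ordinal level = 1 + number of thresholds the latent value exceeds.
        ordinal = 1 + sum(z[2] > t for t in thresholds)
        rows.append((continuous, binary, ordinal))
        labels.append(g)
    return rows, labels


if __name__ == "__main__":
    rows, labels = sample_mixed(
        n=5,
        weights=[0.5, 0.5],
        means=[[-2.0, -2.0, -2.0], [2.0, 2.0, 2.0]],
        sds=[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
        thresholds=[-1.0, 1.0],
    )
    for row, label in zip(rows, labels):
        print(label, row)
```

Fitting goes in the opposite direction: given only the observed rows, an EM-type algorithm would estimate the mixing weights, means and covariances of the latent mixture. That step is not sketched here.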

PUBLICATION RECORD

Publication year
2015
Venue
Advances in Data Analysis and Classification
Publication date
2015-11-05
Fields of study
Mathematics, Computer Science
Identifiers
DOI 10.1007/s11634-016-0238-x, arXiv 1511.01720
Source metadata
Semantic Scholar

REFERENCES

Model-based clustering of Gaussian copulas for mixed data
2014cited by this paper
CLUSTERING SOUTH AFRICAN HOUSEHOLDS BASED ON THEIR ASSET STATUS USING LATENT VARIABLE MODELS.
2014cited by this paper
R: A language and environment for statistical computing.
2014cited by this paper
Model-based clustering using copulas with applications
2014cited by this paper
Mixture of latent trait analyzers for model-based clustering of categorical data
2013cited by this paper
Algorithms from and for Nature and Life - Classification and Data Analysis
2013cited by this paper
A semiparametric approach to mixed outcome latent variable models: Estimating the association between cognition and regional brain volumes
2013cited by this paper
Clustering Ordinal Data via Latent Variable Models
2013influential reference
A latent variables approach for clustering mixed binary and continuous variables within a Gaussian mixture model
2012cited by this paper
Model-based clustering, classification, and discriminant analysis of data with mixed type
2012cited by this paper
mclust Version 4 for R: Normal Mixture Modeling for Model-Based Clustering, Classification, and Density Estimation
2012cited by this paper
Computational aspects of fitting mixture models via the expectation-maximization algorithm
2012cited by this paper
A mixture of generalized latent variable models for mixed mode and heterogeneous data
2011cited by this paper
Bayesian Gaussian Copula Factor Models for Mixed Data
2011cited by this paper
Bayesian Item Response Modeling
2010cited by this paper
A factor mixture analysis model for multivariate binary data
2010cited by this paper
Model-based clustering with non-elliptically contoured distributions
2009cited by this paper
The EM Algorithm and Extensions: Second Edition
2008cited by this paper
Finite Mixture and Markov Switching Models
2006cited by this paper
Bayesian Factor Analysis for Mixed Ordinal and Continuous Responses
2004cited by this paper
Application of rough sets in the presumptive diagnosis of urinary system diseases
2003cited by this paper
Model-Based Clustering, Discriminant Analysis, and Density Estimation
2002cited by this paper
Ordinal Data Modeling
2000cited by this paper
Finite Mixture Models
2000cited by this paper
Application of LDA to speaker recognition
2000cited by this paper
Finite Mixture Modeling with Mixture Outcomes Using the EM Algorithm
1999cited by this paper
Theory & Methods: Mixture model clustering using the MULTIMIX program
1999cited by this paper
Identifiable finite mixtures of location models for clustering mixed-mode data
1999cited by this paper
Robust Cluster Analysis via Mixtures of Multivariate t-Distributions
1998cited by this paper
Mixture separation for mixed-mode data
1996cited by this paper
The EM algorithm and extensions
1996cited by this paper
Gaussian parsimonious clustering models
1995cited by this paper
Bayes factors
1995cited by this paper
Alternative computational approaches to inference in the multinomial probit model
1994influential reference
Model-based Gaussian and non-Gaussian clustering
1993cited by this paper
A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms
1990cited by this paper
A finite mixture model for the clustering of mixed-mode data
1988cited by this paper
Statistical analysis of finite mixture distributions
1986cited by this paper
Data: a collection of problems from many fields for the student and research worker
1985cited by this paper
The choice of treatment for cancer patients based on covariate information.
1980cited by this paper
Estimating the Dimension of a Model
1978influential reference
Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper
1977cited by this paper

CITED BY

Multilevel Latent Class with CUB Models
2026cites this paper
PretopoMD: pretopology-based mixed data hierarchical clustering
2025cites this paper
MSKP: A Proposed K-Prototypes Clustering Algorithm for Multi-Source Medical Data
2025cites this paper
Designing unsupervised mixed‐type feature selection techniques using the heterogeneous correlation matrix
2025cites this paper
Adaptive Sparse Clustering of Mixed Data Using Azzalini-Encoded Ordinal Variables
2025cites this paper
Model-based Clustering Using Adjacent-Categories Logit Models via Finite-Mixtures for Ordinal Data
2025cites this paper
Modal Clustering for Categorical Data
2025cites this paper
Clustering Variables in Simple Nested Piecewise Linear Regression of the First Type
2025cites this paper
MMM: Clustering Multivariate Longitudinal Mixed-type Data
2025cites this paper
Bayesian Variational Inference for Mixed Data Mixture Models
2025cites this paper
A mixture model for skewed mixed-type data
2025cites this paper
Ordinal Clustering with the flex-Scheme
2025cites this paper
Clustering Approaches for Mixed‐Type Data: A Comparative Study
2025cites this paper
Simrec: a similarity measure recommendation system for mixed data clustering algorithms
2025cites this paper
Model-Based Clustering of Multivariate Rating Data Accounting for Feeling and Uncertainty
2025cites this paper
Mixed Data Clustering Survey and Challenges
2025influential citation
Assessment of cyclists yielding to pedestrians at an unsignalized zebra crossing in Germany using drone video
2025cites this paper
Clustering longitudinal ordinal data via finite mixture of matrix-variate distributions
2024influential citation
Clustering large mixed-type data with ordinal variables
2024cites this paper
Spectral Clustering of Categorical and Mixed-Type Data via Extra Graph Nodes
2024cites this paper
A variable clustering approach for overdispersed high-dimensional count data using a copula-based mixture model
2024cites this paper
Changes in physiological arousal during an arithmetic task: profiles of elementary school students and their associations with mindset, task performance and math grade
2024cites this paper
Multi-modal mixed-type structural equation modeling with structured sparsity for subgroup discovery from heterogeneous health data
2024cites this paper
Mixed fuzzy C-means clustering
2024cites this paper
Variational Bayes latent class analysis for EHR-based phenotyping with large real-world data
2024cites this paper
Data-driven subclassification of ANCA-associated vasculitis: model-based clustering of a federated international cohort.
2024cites this paper
Sparse clustering for customer segmentation with high-dimensional mixed-type data
2024cites this paper
Bayesian clustering of mixed-type data with relevant variable identification
2024influential citation
An introduction and tutorial to model-based clustering in education via Gaussian mixture modelling
2023cites this paper
Variational Bayes latent class approach for EHR-based phenotyping with large real-world data
2023cites this paper
DyClee-N&C: a clustering algorithm for heterogeneous data based situation assessment
2023cites this paper
Explore Testing Performance and Learning Behaviors
2023cites this paper
Model-Based Clustering of Mixed Data With Sparse Dependence
2023influential citation
Clustering mixed-type data using a probabilistic distance algorithm
2022cites this paper
K-Prototype Algorithm for Clustering Large Data Sets with Categorical Values to Established Product Segmentation
2022cites this paper
Model Based Co-clustering of Mixed Numerical and Binary Data
2022cites this paper
Spectral Clustering of Mixed-Type Data
2021cites this paper
Multipartition clustering of mixed data with Bayesian networks
2021cites this paper
Device personalization for heterogeneous populations: leveraging physician expertise and national population data to identify medical device patient user groups
2021influential citation
A novel sparse model-based algorithm to cluster categorical data for improved health screening and public health promotion
2021cites this paper
Biomass Clusterization from a Regional Perspective: The Case of Lithuania
2021cites this paper
Big Data Clustering Techniques: Recent Advances and Survey
2021cites this paper
Clustering Mixed-Type Data: A Benchmark Study on KAMILA and K-Prototypes
2021cites this paper
Model-based clustering with missing not at random data
2021cites this paper
Cluster Analysis of Mixed and Missing Chronic Kidney Disease Data in KwaZulu-Natal Province, South Africa
2021cites this paper
A Novel Clustering Algorithm Based on DPC and PSO
2020cites this paper
Model-based co-clustering for mixed type data
2020cites this paper
Disentangling multiproblem behavior in male young adults: A cluster analysis
2020cites this paper
A Study on Efficient Clustering Techniques Involved in Dealing With Diverse Attribute Data
2020cites this paper
Automatic Determination of Clustering Centers for “Clustering by Fast Search and Find of Density Peaks”
2020cites this paper
Co-Clustering of Ordinal Data via Latent Continuous Random Variables and Not Missing at Random Entries
2020cites this paper
An ensemble clustering approach for topic discovery using implicit text segmentation
2020cites this paper
Virtual Modeling of User Populations and Formative Design Parameters
2020cites this paper
Clustering by Finding Average Density
2020cites this paper
Clustering of Big Data with Mixed Features
2020cites this paper
Model-Based Clustering
2020cites this paper
UPCommons Portal del coneixement obert de la UPC
2020cites this paper
Co-Clustering of ordinal data via latent continuous random variables and a classification EM algorithm
2019cites this paper
Gaussian-Based Visualization of Gaussian and Non-Gaussian-Based Clustering
2019cites this paper
Variable Selection
2019cites this paper
Model-based Clustering: Basic Ideas
2019cites this paper
Model-based Clustering with Covariates
2019cites this paper
Non-Gaussian Model-based Clustering
2019cites this paper
Preface
2019cites this paper
Discrete Data Clustering
2019cites this paper
Bibliography
2019cites this paper
Clustering of mixed-type data considering concept hierarchies: problem specification and algorithm
2019cites this paper
Model-Based Clustering and Classification for Data Science: With Applications in R
2019cites this paper
Clustering and prediction of electronic health record data from mental health patients in a Finnish healthcare environment
2019cites this paper
Network data
2019cites this paper
High-dimensional Data
2019cites this paper
Other Topics
2019cites this paper
Dealing with Difficulties
2019cites this paper
Model-Based Clustering and Classification for Data Science
2019cites this paper
Mixtures of general location model with factor analyzer covariance structure for clustering mixed type data
2019cites this paper
Introduction
2019cites this paper
Model-based Classification
2019cites this paper
Semi-supervised Clustering and Classification
2019cites this paper
Survey of State-of-the-Art Mixed Data Clustering Algorithms
2018cites this paper
Distance Metrics and Clustering Methods for Mixed‐type Data
2018cites this paper
Heart Disease Diagnosis via Nonparametric Mixture Models
2018cites this paper
Finite mixture biclustering of discrete type multivariate data
2018cites this paper
Rank-based approach for estimating correlations in mixed ordinal data
2018cites this paper
A Survey of Mixed Data Clustering Algorithms
2018influential citation
clustMixType: User-Friendly Clustering of Mixed-Type Data in R
2018cites this paper
Distance‐based clustering of mixed data
2018cites this paper
A Survey of Mixed Data Clustering Algorithms
2018cites this paper
Unifying data units and models in (co-)clustering
2018cites this paper
Clustering and variable selection in the presence of mixed variable types and missing data
2017cites this paper
Gaussian parsimonious clustering models with covariates and a noise component
2017cites this paper
Model‐based clustering for spatiotemporal data on air quality monitoring
2017cites this paper
Model-Based and Nonparametric Approaches to Clustering for Data Compression in Actuarial Applications
2017cites this paper
Automatic clustering based on density peak detection using generalized extreme value distribution
2017cites this paper
Co-clustering de données mixtes à base des modèles de mélange
2017cites this paper
Vine copulas for mixed data: multi-view clustering for mixed data beyond meta-Gaussian dependencies
2017cites this paper
Variable selection methods for model-based clustering
2017cites this paper
Parsimonious Model-Based Clustering with Covariates
2017cites this paper
Developing Accurate Early Warning Systems Via Data Analytics
2016cites this paper
Clustering high‐dimensional mixed data to uncover sub‐phenotypes: joint analysis of phenotypic and genotypic data
2016cites this paper
Flexible model-based clustering of mixed binary and continuous data: application to genetic regulation and cancer
2016cites this paper