Approximate nonparametric maximum likelihood for mixture models: A convex optimization approach to fitting arbitrary multivariate mixing distributions

Published 2018 in Computational Statistics & Data Analysis

ABSTRACT

Nonparametric maximum likelihood (NPML) for mixture models is a technique for estimating mixing distributions that has a long and rich history in statistics going back to the 1950s, and is closely related to empirical Bayes methods. Historically, NPML-based methods have been considered to be relatively impractical because of computational and theoretical obstacles. However, recent work focusing on approximate NPML methods suggests that these methods may have great promise for a variety of modern applications. Building on this recent work, a class of flexible, scalable, and easy-to-implement approximate NPML methods is studied for problems with multivariate mixing distributions. Concrete guidance on implementing these methods is provided, with theoretical and empirical support; topics covered include identifying the support set of the mixing distribution, and comparing algorithms (across a variety of metrics) for solving the simple convex optimization problem at the core of the approximate NPML problem. Additionally, three diverse real data applications are studied to illustrate the methods’ performance: (i) a baseball data analysis (a classical example for empirical Bayes methods), (ii) high-dimensional microarray classification, and (iii) online prediction of blood-glucose density for diabetes patients. Among other things, the empirical results demonstrate the relative effectiveness of using multivariate (as opposed to univariate) mixing distributions for NPML-based approaches.
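
As an illustration of the kind of convex problem the abstract refers to, the following is a minimal sketch and not the authors' implementation. It assumes a fixed candidate support set (here a regular 2-D grid), Gaussian noise with a known common scale, and plain EM fixed-point updates as the solver; none of these choices is stated in the abstract. Under those assumptions, the only unknowns are the mixing weights on the support points, and the log-likelihood is concave in those weights. The function names `likelihood_matrix`, `fit_weights_em`, and `log_likelihood` are hypothetical.

```python
import numpy as np


def likelihood_matrix(x, support, sigma=1.0):
    """L[i, j] = density of x[i] under N(support[j], sigma^2 I).

    x has shape (n, d) and support has shape (m, d).
    The Gaussian model with known sigma is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    support = np.asarray(support, dtype=float)
    d = x.shape[1]
    # Squared Euclidean distance between every observation and support point.
    sq_dist = ((x[:, None, :] - support[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq_dist / sigma**2) / (2.0 * np.pi * sigma**2) ** (d / 2.0)


def log_likelihood(L, w):
    """Mixture log-likelihood for weights w on the fixed support."""
    return float(np.log(L @ w).sum())


def fit_weights_em(L, n_iter=500):
    """Maximize the log-likelihood over the probability simplex with EM updates."""
    n, m = L.shape
    w = np.full(m, 1.0 / m)  # start from uniform weights
    for _ in range(n_iter):
        mix = L @ w  # marginal density of each observation
        w = w * (L.T @ (1.0 / mix)) / n  # update keeps w on the simplex
    return w


# Synthetic example: latent means at the four corners (+-2, +-2), unit noise.
rng = np.random.default_rng(0)
centers = rng.choice([-2.0, 2.0], size=(300, 2))
x = centers + rng.normal(size=(300, 2))

# Candidate support: a 15 x 15 grid covering the plausible range of the means.
axis = np.linspace(-4.0, 4.0, 15)
support = np.array([(a, b) for a in axis for b in axis])

L = likelihood_matrix(x, support)
w = fit_weights_em(L)
print("fitted log-likelihood:", log_likelihood(L, w))
```

Because the support is fixed in advance, the fitted weights `w` define a discrete estimate of the mixing distribution, and any other solver for simplex-constrained concave maximization could replace the EM loop in this sketch.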

PUBLICATION RECORD

Publication year
2018
Venue
Computational Statistics & Data Analysis
Publication date
2018-06-01
Fields of study
Mathematics, Computer Science
Identifiers
DOI 10.1016/j.csda.2018.01.006
Source metadata
Semantic Scholar

CITED BY

TAN-FGBMLE: Tree-Augmented Naive Bayes Structure Learning Based on Fast Generative Bootstrap Maximum Likelihood Estimation for Continuous-Variable Classification
2025 (cites this paper)
Nonparametric MLE for Gaussian Location Mixtures: Certified Computation and Generic Behavior
2025 (cites this paper)
Model-free Estimation of Latent Structure via Multiscale Nonparametric Maximum Likelihood
2024 (cites this paper)
Neural-g: A Deep Learning Framework for Mixing Density Estimation
2024 (influential citation)
Fast Bootstrapping Nonparametric Maximum Likelihood for Latent Mixture Models
2024 (cites this paper)
A Mean Field Approach to Empirical Bayes Estimation in High-dimensional Linear Regression
2023 (cites this paper)
Generalized maximum likelihood estimation of the mean of parameters of mixtures. With applications to sampling and to observational studies
2022 (cites this paper)
Guidance on When to Estimate a Future Price Factor: Development of Criteria and Thresholds
2022 (cites this paper)
Generalized maximum likelihood estimation of the mean of parameters of mixtures, with applications to sampling
2021 (cites this paper)
A nonparametric empirical Bayes approach to large-scale multivariate regression
2021 (cites this paper)
A Regression Modeling Approach to Structured Shrinkage Estimation
2021 (influential citation)
Improved nonparametric penalized maximum likelihood estimation for arbitrarily censored survival data
2021 (cites this paper)
Uniform consistency in nonparametric mixture models
2021 (influential citation)
Multivariate, heteroscedastic empirical Bayes via nonparametric maximum likelihood
2021 (cites this paper)
High-dimensional linear discriminant analysis using nonparametric methods
2021 (cites this paper)
High‐dimensional classification based on nonparametric maximum likelihood estimation under unknown and inhomogeneous variances
2021 (cites this paper)
A Conditional Gradient Approach for Nonparametric Estimation of Mixing Distributions
2020 (cites this paper)
A nonparametric empirical Bayes approach to covariance matrix estimation
2020 (cites this paper)
Likelihood Maximization and Moment Matching in Low SNR Gaussian Mixture Models
2020 (cites this paper)
A compound decision approach to covariance matrix estimation
2020 (cites this paper)
Generalized Maximum Likelihood Estimators and their applications to stratified sampling and post-stratification with many unobserved strata
2019 (cites this paper)
Applications of Generalized Maximum Likelihood Estimators to stratified sampling and post-stratification with many unobserved strata
2019 (cites this paper)
Simultaneous estimation of normal means with side information
2019 (influential citation)
Rethinking Customer Segmentation and Demand Learning in the Presence of Sparse, Diverse, and Large-scale Data
2018 (cites this paper)
Powerful genome-wide design and robust statistical inference in two-sample summary-data Mendelian randomization
2018 (cites this paper)