Several methods for variable selection have been proposed in model-based clustering and classification. These make use of backward or forward procedures to define the roles of the variables. Unfortunately, such stepwise procedures are slow and the resulting algorithms inefficient when analyzing large data sets with many variables. In this paper, we propose an alternative regularization approach for variable selection in model-based clustering and classification. In our approach the variables are first ranked using a lasso-like procedure in order to avoid slow stepwise algorithms. Thus, the variable selection methodology of Maugis et al. (Comput Stat Data Anal 53:3872–3882, 2009b) can be efficiently applied to high-dimensional data sets.
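The ranking idea in the abstract can be illustrated with a small sketch. This is not the authors' procedure, whose details are not given here: it assumes a classification setting with known labels and uses soft-thresholded standardized class means as a stand-in for the lasso-like step. The names `soft_threshold` and `rank_variables`, and the penalty grid `lambdas`, are hypothetical. A variable scores one point for each penalty value under which at least one of its class means stays nonzero, so variables that resist shrinkage longer are ranked first.

```python
from statistics import mean, pstdev


def soft_threshold(value, lam):
    """Lasso-style shrinkage: move value toward zero by lam, clipping at zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def rank_variables(rows, labels, lambdas):
    """Rank variables by how long their class means survive L1 shrinkage.

    rows:    list of observations, each a list of floats
    labels:  known class label of each observation (assumed setting)
    lambdas: grid of positive penalty values (hypothetical tuning grid)
    Returns (order, scores): variable indices from most to least relevant,
    and the per-variable survival counts.
    """
    n_vars = len(rows[0])
    classes = sorted(set(labels))
    scores = [0] * n_vars
    for j in range(n_vars):
        column = [row[j] for row in rows]
        # Standardize so that one penalty scale applies to every variable.
        center = mean(column)
        scale = pstdev(column) or 1.0  # guard against constant variables
        class_means = [
            mean((x - center) / scale
                 for x, lab in zip(column, labels) if lab == k)
            for k in classes
        ]
        # Count the penalties under which some class mean is still nonzero.
        for lam in lambdas:
            if any(soft_threshold(m, lam) != 0.0 for m in class_means):
                scores[j] += 1
    # Python's sort is stable, so ties keep their original variable order.
    order = sorted(range(n_vars), key=lambda j: -scores[j])
    return order, scores
```

A ranking of this kind only fixes the order in which variables are examined; under the approach described in the abstract, the roles of the variables would still be decided afterwards by the selection methodology of Maugis et al., applied along that order instead of through a full stepwise search.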
Variable selection in model-based clustering and discriminant analysis with a regularization approach
G. Celeux, C. Maugis-Rabusseau, M. Sedki
Published 2017 in Advances in Data Analysis and Classification
PUBLICATION RECORD
- Publication year
2017
- Venue
Advances in Data Analysis and Classification
- Publication date
2017-05-02
- Fields of study
Mathematics, Computer Science
- Source metadata
Semantic Scholar