Benchmarking of Clustering Validity Measures Revisited

Connor Simpson, Ricardo J. G. B. Campello, Elizabeth Stojanovski

Published 2025 in Statistical Analysis and Data Mining

ABSTRACT

Validation plays a crucial role in the clustering process. Many internal validity indices exist for determining the best clustering solution(s) from a given collection of candidates, for example, as produced by different algorithms or different algorithm hyper‐parameters. In this study, we present a comprehensive benchmark of 26 internal validity indices, including highly popular classic indices as well as more recently developed ones. We adopted an enhanced revision of the methodology presented in Vendramin et al. (2010), developed here to address several shortcomings of that previous work. The new approach consists of three complementary custom‐tailored evaluation sub‐methodologies, each designed to assess specific aspects of an index's behavior while preventing potential biases of the other sub‐methodologies. Each sub‐methodology features two complementary measures of performance, alongside mechanisms that allow for an in‐depth investigation of more complex behaviors of the internal validity indices under study. Additionally, a new collection of 16,177 datasets has been produced, paired with eight widely used clustering algorithms, for a wider scope of applicability and a representation of more diverse clustering scenarios.
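To make the use case in the abstract concrete, the following minimal sketch scores a few candidate partitions of a toy dataset with one internal validity index and keeps the highest-scoring one. It is an illustration only, not the benchmark methodology of the paper: the silhouette width is used as an example of a classic internal index (the abstract does not list which 26 indices were studied), and the data, the candidate partitions, and the names `silhouette`, `points`, and `candidates` are all assumptions made for this example.

```python
from math import dist


def silhouette(points, labels):
    """Mean silhouette width of a partition (singleton clusters score 0)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


# Hypothetical toy data: two tight, well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Hypothetical candidate partitions, standing in for the output of
# different algorithms or different hyper-parameter settings.
candidates = {
    "matches_groups": [0, 0, 0, 1, 1, 1],
    "mixes_groups": [0, 1, 0, 1, 0, 1],
    "three_clusters": [0, 0, 1, 2, 2, 2],
}

scores = {name: silhouette(points, labels) for name, labels in candidates.items()}
best = max(scores, key=scores.get)  # higher silhouette is better
print(best, scores)
```

An internal index like this uses only the data and the partition, with no ground-truth labels, which is why the choice of index matters when candidates come from different algorithms or settings.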
