Towards Understanding Generalization of Macro-AUC in Multi-label Learning

Published 2023 in International Conference on Machine Learning

ABSTRACT

Macro-AUC, the arithmetic mean of the class-wise AUCs, is a commonly used measure in multi-label learning. However, its theoretical understanding is still largely lacking. To address this gap, we characterize the generalization properties of various learning algorithms based on the corresponding surrogate losses w.r.t. Macro-AUC. We theoretically identify a critical factor of the dataset that affects the generalization bounds: the label-wise class imbalance. Our imbalance-aware error bounds show that the widely used univariate loss-based algorithm is more sensitive to the label-wise class imbalance than the proposed pairwise and reweighted loss-based ones, which probably implies its worse performance. Moreover, empirical results on various datasets corroborate our theoretical findings. To establish these results, we propose a new (and more general) McDiarmid-type concentration inequality, which may be of independent interest.
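
To make the measure in the abstract concrete, the following is a minimal sketch of Macro-AUC computed as the arithmetic mean of per-label AUCs, where each per-label AUC is the fraction of (positive, negative) instance pairs ranked correctly. The function names, the half-credit treatment of tied scores, and the `minority_fraction` summary of label-wise imbalance are illustrative assumptions, not definitions taken from the paper.

```python
import numpy as np


def auc_single_label(scores, labels):
    """AUC for one label: share of (positive, negative) pairs ranked correctly.

    Tied scores receive half credit (an assumption of this sketch).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC is undefined without both positives and negatives")
    # Pairwise score differences between every positive and every negative.
    diff = pos[:, None] - neg[None, :]
    correct = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(correct / diff.size)


def macro_auc(score_matrix, label_matrix):
    """Arithmetic mean of the per-label AUCs over the K label columns."""
    score_matrix = np.asarray(score_matrix, dtype=float)
    label_matrix = np.asarray(label_matrix)
    num_labels = label_matrix.shape[1]
    per_label = [
        auc_single_label(score_matrix[:, k], label_matrix[:, k])
        for k in range(num_labels)
    ]
    return float(np.mean(per_label))


def minority_fraction(label_matrix):
    """Hypothetical per-label imbalance summary: min(#pos, #neg) / n.

    The paper's exact imbalance quantity may be defined differently.
    """
    label_matrix = np.asarray(label_matrix)
    n = label_matrix.shape[0]
    num_pos = label_matrix.sum(axis=0)
    return np.minimum(num_pos, n - num_pos) / n
```

Each label is scored independently and then averaged with equal weight, so a rare label contributes as much to Macro-AUC as a frequent one. This is one way to see why the per-label positive/negative split can matter for the measure.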

PUBLICATION RECORD

Publication year: 2023
Venue: International Conference on Machine Learning
Publication date: 2023-05-09
Fields of study: Mathematics, Computer Science
Identifiers: DOI 10.48550/arXiv.2305.05248; arXiv 2305.05248
Source metadata: Semantic Scholar

REFERENCES

Generalization bounds for learning under graph-dependence: a survey
2022, cited by this paper
Rethinking and Reweighting the Univariate Losses for Multi-Label Ranking: Consistency and Generalization
2021, influential reference
A review of methods for imbalanced multi-label classification
2021, influential reference
Foundations of Machine Learning
2021, cited by this paper
Convex Calibrated Surrogates for the Multi-Label F-Measure
2020, cited by this paper
Multi-label classification: do Hamming loss and subset accuracy really conflict with each other?
2020, influential reference
Asymmetric Loss For Multi-Label Classification
2020, cited by this paper
Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets
2020, cited by this paper
Multi-label optimal margin distribution machine
2019, cited by this paper
Joint Ranking SVM and Binary Relevance with Robust Low-Rank Learning for Multi-Label Classification
2019, cited by this paper
McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds
2019, cited by this paper
Multilabel reductions: what is my loss optimising?
2019, cited by this paper
Consistency Analysis for Binary Classification Revisited
2017, cited by this paper
A Unified View of Multi-Label Performance Measures
2016, cited by this paper
Barzilai-Borwein Step Size for Stochastic Gradient Descent
2016, cited by this paper
Consistent Multilabel Classification
2015, cited by this paper
Learning with Partially Labeled and Interdependent Data
2015, influential reference
Towards Class-Imbalance Aware Multi-Label Learning
2015, cited by this paper
A Review on Multi-Label Learning Algorithms
2014, cited by this paper
On the Bayes-optimality of F-measure maximizers
2013, influential reference
Concentration Inequalities: A Nonasymptotic Theory of Independence
2013, cited by this paper
Optimizing F-Measures: A Tale of Two Approaches
2012, cited by this paper
Consistent multilabel ranking through univariate loss minimization
2012, influential reference
On the Consistency of AUC Pairwise Optimization
2012, cited by this paper
Bipartite Ranking through Minimization of Univariate Loss
2011, influential reference
Large Scale Max-Margin Multi-Label Classification with Priors
2010, cited by this paper
Computational Complexity: A Modern Approach
2009, cited by this paper
Supervised Learning of Semantic Classes for Image Annotation and Retrieval
2007, cited by this paper
Generalization error bounds for classifiers trained with interdependent data
2005, influential reference
Local Rademacher complexities
2005, cited by this paper
Large deviations for sums of partly dependent random variables
2004, influential reference
Learning multi-label scene classification
2004, influential reference
A kernel method for multi-labelled classification
2001, cited by this paper
BoosTexter: A Boosting-based System for Text Categorization
2000, cited by this paper
Neural Network Learning: Theoretical Foundations
1999, cited by this paper
Surveys in Combinatorics, 1989: On the method of bounded differences
1989, cited by this paper

CITED BY

MetaLink: An imbalanced multi-label classification approach for automatic tongue diagnosis
2026, cites this paper
Anomaly-resilient geofencing and predictive navigation in IoT environments using machine learning and federated learning for metaverse workplaces and smart shopping malls
2026, cites this paper
MAR-GCNet: Multi-label abnormal detection of electrocardiograms by combining multiscale features and graph convolutional networks
2026, cites this paper
Evidential Mixture Machines: Deciphering Multi-Label Correlations for Active Learning Sensitivity
2024, cites this paper
Top-K Pairwise Ranking: Bridging the Gap Among Ranking-Based Measures for Multi-label Classification
2024, cites this paper
Generalization Analysis for Label-Specific Representation Learning
2024, cites this paper
Multi-Label Learning with Stronger Consistency Guarantees
2024, cites this paper
Exploiting Meta-Learned Confidences for Imbalanced Multilabel Learning
2024, influential citation