Fair Bayesian Data Selection via Generalized Discrepancy Measures

Yixuan Zhang, Jiabin Luo, Zheng-G Wang, Feng Zhou, Quyu Kong

Published 2025 in arXiv.org

ABSTRACT

Fairness concerns are increasingly critical as machine learning models are deployed in high-stakes applications. While existing fairness-aware methods typically intervene at the model level, they often suffer from high computational costs, limited scalability, and poor generalization. To address these challenges, we propose a Bayesian data selection framework that ensures fairness by aligning group-specific posterior distributions of model parameters and sample weights with a shared central distribution. Our framework supports flexible alignment via various distributional discrepancy measures, including Wasserstein distance, maximum mean discrepancy, and $f$-divergence, allowing geometry-aware control without imposing explicit fairness constraints. This data-centric approach mitigates group-specific biases in training data and improves fairness in downstream tasks, with theoretical guarantees. Experiments on benchmark datasets show that our method consistently outperforms existing data selection and model-based fairness methods in both fairness and accuracy.
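
As a rough illustration of the alignment idea in the abstract, the sketch below measures how far each group's posterior samples sit from a shared central set using a squared maximum mean discrepancy (MMD) with a Gaussian kernel. It is not the authors' implementation: the function names, the fixed kernel bandwidth, the synthetic Gaussian "particles", and the choice of the pooled samples as the central distribution are all assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    return (
        rbf_kernel(x, x, bandwidth).mean()
        + rbf_kernel(y, y, bandwidth).mean()
        - 2.0 * rbf_kernel(x, y, bandwidth).mean()
    )


def alignment_penalty(group_particles, central_particles, bandwidth=1.0):
    """Sum of squared MMDs between each group's samples and the central samples."""
    return sum(
        mmd_squared(p, central_particles, bandwidth) for p in group_particles
    )


# Hypothetical posterior samples for two groups (synthetic data, assumption).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, size=(200, 2))
group_b = rng.normal(loc=1.5, size=(200, 2))

# Assumption: use the pooled samples as a stand-in for the central distribution.
central = np.vstack([group_a, group_b])

penalty = alignment_penalty([group_a, group_b], central)
print(f"alignment penalty: {penalty:.4f}")
```

A penalty of this kind shrinks as the group-specific sample sets move toward the central set. Swapping `mmd_squared` for a Wasserstein or $f$-divergence estimator would follow the same pattern, although how the paper combines the discrepancy with the data selection objective is not specified in this record.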

PUBLICATION RECORD

Publication year: 2025
Venue: arXiv.org
Publication date: 2025-11-10
Fields of study: Mathematics, Computer Science
Identifiers: DOI 10.48550/arXiv.2511.07032; arXiv 2511.07032
Source metadata: Semantic Scholar
