Adversarial Bias: Data Poisoning Attacks on Fairness
Published 2025 in BigData Congress [Services Society]
ABSTRACT
With the growing adoption of AI and machine learning systems in real-world applications, ensuring their fairness has become increasingly critical. In this work, we first provide a theoretical analysis demonstrating that a simple adversarial poisoning strategy is sufficient to induce maximally unfair behavior in naive Bayes classifiers. Our key idea is to inject a small fraction of carefully crafted adversarial data points into the training set, biasing the model's decision boundary to disproportionately affect a protected group while preserving overall predictive performance. To demonstrate the practical effectiveness of our approach, we conduct experiments on three benchmark datasets with multiple models. Our attack significantly outperforms existing methods in degrading fairness metrics, often achieving substantially higher levels of unfairness at a comparable or only slightly worse cost in accuracy. Notably, and in contrast to prior work, our method remains effective across a wide range of models, demonstrating a robust and potent approach to compromising the fairness of machine learning systems.
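The abstract does not spell out how the adversarial points are crafted, but the idea it describes can be sketched. The following is a minimal, hypothetical illustration, not the paper's algorithm: the synthetic task, the use of GaussianNB, the 5% poison budget, the near-boundary placement of poison features, and the demographic parity gap as the fairness measure are all assumptions made for the sake of the example.

```python
# Hypothetical sketch, not the paper's algorithm: poison a naive Bayes
# classifier so its positive predictions skew against a protected group.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Assumed synthetic task: two informative features, a binary protected
# attribute `a` used as a third input feature, and a label driven by x0.
n = 4000
X = rng.normal(size=(n, 2))
a = rng.integers(0, 2, size=n).astype(float)
y = (X[:, 0] > 0).astype(int)
features = np.column_stack([X, a])

def dp_gap(y_pred, a):
    # Demographic parity gap: |P(y_hat = 1 | a = 1) - P(y_hat = 1 | a = 0)|.
    return abs(y_pred[a == 1].mean() - y_pred[a == 0].mean())

# Clean baseline.
clf = GaussianNB().fit(features, y)
pred = clf.predict(features)
print(f"clean:    acc={np.mean(pred == y):.3f}  dp_gap={dp_gap(pred, a):.3f}")

# Poisoning step (assumed 5% budget): every crafted point carries the
# protected attribute a = 1 together with the negative label, and its
# features sit near the decision boundary so accuracy degrades only a little.
eps = 0.05
m = int(eps * n)
X_poison = rng.normal(scale=0.3, size=(m, 2))
poison_features = np.column_stack([X_poison, np.ones(m)])
poison_labels = np.zeros(m, dtype=int)

clf_poisoned = GaussianNB().fit(
    np.vstack([features, poison_features]),
    np.concatenate([y, poison_labels]),
)
pred_p = clf_poisoned.predict(features)  # evaluate on the clean points
print(f"poisoned: acc={np.mean(pred_p == y):.3f}  dp_gap={dp_gap(pred_p, a):.3f}")
```

Placing the poison features near the decision boundary is one way to trade attack strength against accuracy impact, in the spirit of the abstract's claim of a "comparable or only slightly worse impact on accuracy"; the paper's actual crafting strategy may differ.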
PUBLICATION RECORD
- Publication year: 2025
- Venue: BigData Congress [Services Society]
- Publication date: 2025-11-11
- Fields of study: Computer Science
- Source metadata: Semantic Scholar