On Measuring and Mitigating Biased Inferences of Word Embeddings
Sunipa Dev, Tao Li, J. M. Phillips, Vivek Srikumar
Published 2019 in AAAI Conference on Artificial Intelligence

ABSTRACT
Word embeddings carry stereotypical connotations from the text they are trained on, which can lead to invalid inferences in downstream models that rely on them. We use this observation to design a mechanism for measuring stereotypes using the task of natural language inference. We demonstrate a reduction in invalid inferences via bias mitigation strategies on static word embeddings (GloVe). Further, we show that for gender bias, these techniques extend to contextualized embeddings when applied selectively only to the static components of contextualized embeddings (ELMo, BERT).
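The bias mitigation the abstract alludes to operates on the embedding vectors themselves. A common family of such strategies removes the component of each word vector along a learned bias direction (e.g., a gender direction). The sketch below is illustrative only, not the paper's exact method: the `debias` function and the toy 3-d vectors are assumptions for demonstration, not real GloVe embeddings.

```python
import numpy as np

def debias(embedding: np.ndarray, bias_dir: np.ndarray) -> np.ndarray:
    """Return `embedding` with its component along `bias_dir` projected out.

    This is the generic linear-projection debiasing idea; the specific
    bias direction would in practice be estimated from word pairs such as
    (he, she).
    """
    v = bias_dir / np.linalg.norm(bias_dir)  # unit bias direction
    return embedding - np.dot(embedding, v) * v

# Toy example with made-up 3-d vectors (hypothetical, not real GloVe data).
bias_dir = np.array([1.0, 0.0, 0.0])   # assumed gender direction
w = np.array([0.5, 0.2, -0.1])         # assumed word embedding
w_debiased = debias(w, bias_dir)
```

After projection, the debiased vector is orthogonal to the bias direction, so a downstream model can no longer pick up that component when making inferences.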
PUBLICATION RECORD
- Publication year
2019
- Venue
AAAI Conference on Artificial Intelligence
- Publication date
2019-08-25
- Fields of study
Computer Science
- Source metadata
Semantic Scholar