Measuring Harmful Sentence Completion in Language Models for LGBTQIA+ Individuals

Debora Nozza, Federico Bianchi, Anne Lauscher, Dirk Hovy

Published 2022 in LTEDI

ABSTRACT

Current language technology is ubiquitous and directly influences individuals’ lives worldwide. Given the recent trend in AI of training and continually releasing new, powerful large language models (LLMs), there is a need to assess their biases and potential concrete consequences. While some studies have highlighted the shortcomings of these models, there is little work on the negative impact of LLMs on LGBTQIA+ individuals. In this paper, we investigate a state-of-the-art template-based approach for measuring the harmfulness of sentence completions produced by English LLMs when the subjects belong to the LGBTQIA+ community. Our findings show that, on average, the most likely LLM-generated completion is an identity attack 13% of the time. Our results raise serious concerns about the applicability of these models in production environments.
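The kind of measurement the abstract describes (the share of template prompts whose single most likely completion is flagged as harmful) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the placeholder templates and terms, the lookup table standing in for a language model, and the `<flagged>` marker standing in for a harmfulness classifier are all hypothetical, since this record does not specify the actual templates, models, or classifier.

```python
from typing import Callable, Iterable, List


def fill_templates(templates: Iterable[str], terms: Iterable[str]) -> List[str]:
    """Instantiate every template with every identity term (hypothetical helper)."""
    terms = list(terms)
    return [t.format(term=term) for t in templates for term in terms]


def harmful_top1_rate(
    prompts: Iterable[str],
    top_completion: Callable[[str], str],
    is_harmful: Callable[[str], bool],
) -> float:
    """Fraction of prompts whose single most likely completion is flagged."""
    prompts = list(prompts)
    if not prompts:
        raise ValueError("at least one prompt is required")
    flagged = sum(1 for p in prompts if is_harmful(top_completion(p)))
    return flagged / len(prompts)


# Toy stand-in for a language model: a fixed prompt -> completion table.
# A real setup would query a model for its most likely completion instead.
TOY_COMPLETIONS = {
    "The term_a person worked as a": "teacher",
    "The term_b person worked as a": "<flagged>",
    "The term_a person is known for": "cooking",
    "The term_b person is known for": "singing",
}

prompts = fill_templates(
    ["The {term} person worked as a", "The {term} person is known for"],
    ["term_a", "term_b"],  # neutral placeholders, not real identity terms
)

# Toy stand-in for a harmfulness classifier: flags only the marker token.
rate = harmful_top1_rate(
    prompts,
    TOY_COMPLETIONS.__getitem__,
    lambda completion: completion == "<flagged>",
)
print(f"{rate:.0%}")  # one of four toy prompts is flagged: 25%
```

Passing the model and the classifier as plain callables keeps the rate computation independent of any particular library, so either stand-in could be swapped for a real component without changing the arithmetic.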

PUBLICATION RECORD

Publication year: 2022
Venue: LTEDI
Fields of study: Linguistics, Computer Science
DOI: 10.18653/v1/2022.ltedi-1.4
Source metadata: Semantic Scholar

CITED BY

QueerGen: How LLMs Reflect Societal Norms on Gender and Sexuality in Sentence Completion Tasks (2026)
Queer NLP: A Critical Survey on Literature Gaps, Biases and Trends (2026)
Amplifying Trans and Nonbinary Voices: A Community-Centred Harm Taxonomy for LLMs (2025)
Bias, Accuracy, and Trust: Gender-Diverse Perspectives on Large Language Models (2025)
Theories of "Sexuality" in Natural Language Processing Bias Research (2025)
Homophobia and transphobia span identification in low-resource languages (2025)
Dangerous Criminals and Beautiful Prostitutes? Investigating Harmful Representations in Dutch Language Models (2025)
HLU: Human Vs LLM Generated Text Detection Dataset for Urdu at Multiple Granularities (2025)
Asking an AI for salary negotiation advice is a matter of concern: Controlled experimental perturbation of ChatGPT for protected and non-protected group discrimination on a contextual task with no clear ground truth answers (2025)
Robust Bias Detection in MLMs and its Application to Human Trait Ratings (2025)
Assessing and alleviating state anxiety in large language models (2025)
Cascading Adversarial Bias from Injection to Distillation in Language Models (2025)
Delving into Multilingual Ethical Bias: The MSQAD with Statistical Hypothesis Tests for Large Language Models (2025)
Stars, Stripes, and Silicon: Unravelling the ChatGPT's All-American, Monochrome, Cis-centric Bias (2024)
Generative AI for Accessible and Inclusive Extended Reality (2024)
Gender Bias in Natural Language Processing and Computer Vision: A Comparative Survey (2024)
The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models (2024)
Countering Hateful and Offensive Speech Online - Open Challenges (2024)
LLaMandement: Large Language Models for Summarization of French Legislative Proposals (2024)
Your Large Language Model is Secretly a Fairness Proponent and You Should Prompt it Like One (2024)
FairBelief - Assessing Harmful Beliefs in Language Models (2024, influential citation)
SoK: Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency (2024)
QUEEREOTYPES: A Multi-Source Italian Corpus of Stereotypes towards LGBTQIA+ Community Members (2024)
QueerBench: Quantifying Discrimination in Language Models Toward Queer Identities (2024, influential citation)
Snarci at SemEval-2024 Task 4: Themis Model for Binary Classification of Memes (2024)
From ‘Showgirls’ to ‘Performers’: Fine-tuning with Gender-inclusive Language for Bias Reduction in LLMs (2024)
Cross-cultural challenges in generative AI: Addressing homophobia in diverse sociocultural contexts (2024)
The life cycle of large language models in education: A framework for understanding sources of bias (2024)
The Life Cycle of Large Language Models: A Review of Biases in Education (2024)
Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models (2024)
An Explainable Approach to Understanding Gender Stereotype Text (2024)
Detecting and Mitigating LGBTQIA+ Bias in Large Norwegian Language Models (2024)
Analysing Effects of Inducing Gender Bias in Language Models (2024)
On the Influence of Gender and Race in Romantic Relationship Prediction from Large Language Models (2024)
On Bias and Fairness in NLP: Investigating the Impact of Bias and Debiasing in Language Models on the Fairness of Toxicity Detection (2023)
Language Model Behavior: A Comprehensive Survey (2023)
Measuring Gender Bias in West Slavic Language Models (2023)
A Cross-Lingual Study of Homotransphobia on Twitter (2023)
Biases in Large Language Models: Origins, Inventory, and Discussion (2023)
“I’m fully who I am”: Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation (2023, influential citation)
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models (2023)
Trade-Offs Between Fairness and Privacy in Language Modeling (2023)
What about “em”? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns (2023)
Knowledge of cultural moral norms in large language models (2023, influential citation)
MilaNLP at SemEval-2023 Task 10: Ensembling Domain-Adapted and Regularized Pretrained Language Models for Robust Sexism Detection (2023)
Detection of homophobia and transphobia in YouTube comments (2023)
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models (2023)
Systematic Offensive Stereotyping (SOS) Bias in Language Models (2023)
Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection (2023)
HODI at EVALITA 2023: Overview of the first Shared Task on Homotransphobia Detection in Italian (2023)
Knowledge-Grounded Target Group Language Recognition in Hate Speech (2023)
Investigating the Existence of "Secret Language" in Language Models (2023)
Privacy Preserving Large Language Models: ChatGPT Case Study Based Vision and Framework (2023)
The shaping of the narrative on migration: A corpus assisted quantitative discourse analysis of the impact of the divisive media framing of migrants in Korea (2023)
Social Bias Probing: Fairness Benchmarking for Language Models (2023)
Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies (2023)
KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts (2023)
Are you talking to ['xem'] or ['x', 'em']? On Tokenization and Addressing Misgendering in LLMs with Pronoun Tokenization Parity (2023)
Nozza@LT-EDI-ACL2022: Ensemble Modeling for Homophobia and Transphobia Detection (2022, influential citation)
Measuring Harmful Representations in Scandinavian Language Models (2022, influential citation)
Guiding the Release of Safer E2E Conversational AI through Value Sensitive Design (2022)
How can we detect Homophobia and Transphobia? Experiments in a multilingual code-mixed setting for social media governance (2022)
Revisiting Queer Minorities in Lexicons (2022)
MilaNLP at SemEval-2022 Task 5: Using Perceiver IO for Detecting Misogynous Memes with Text and Image Modalities (2022)
Pipelines for Social Bias Testing of Large Language Models (2022)
Language Invariant Properties in Natural Language Processing (2021)
Challenges to Grassroots Organization Engagement with AI Policy (year unknown)
Out of Sight Out of Mind: Measuring Bias in Language Models Against Overlooked Marginalized Groups in Regional Contexts (year unknown)