Enhancing Machine Learning in Abusive Language Detection with Dataset Integration

Samaneh Hosseini Moghaddam, Kelly Lyons, Cheryl Regehr, Frank Rudzicz, V. Goel, Kaitlyn Regehr

Published 2025 in Conference of the Centre for Advanced Studies on Collaborative Research

ABSTRACT

Abusive language detection models are widely reported to generalize poorly, which limits their real-world effectiveness. This is largely due to sampling and lexical biases in datasets. In response, we aim to enhance the generalizability of abusive language detection models by leveraging and unifying existing datasets. We harmonize ten publicly available datasets under a consistent definition of abusive language and integrate them into a single dataset. Our core hypothesis is that while individual datasets exhibit sampling bias, their complementary characteristics can be harnessed to create a broader and more representative training distribution. To evaluate this hypothesis, we first empirically demonstrate the extent of sampling bias across datasets. We then systematically integrate multiple datasets into an aggregated corpus and compare the classification performance of models trained on each individual dataset with that of a model trained on the aggregated corpus, using a held-out, uniformly sampled benchmark comprising data from all datasets. The integrated model improves macro-F1 from 0.60 (average across single datasets) to 0.84. Furthermore, we quantify each dataset's contribution to the integrated model's performance gains and its lexical dissimilarity relative to the others, and find a strong correlation between the two, with a magnitude of 0.71. These findings suggest that integrating lexically diverse datasets exposes models to a broader spectrum of abuse-related language, mitigating dataset-specific sampling biases and enhancing model generalizability in real-world scenarios.
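
The two quantities the abstract relies on, macro-F1 and lexical dissimilarity between datasets, can be illustrated with a minimal Python sketch. The abstract does not say how lexical dissimilarity is measured, so the Jaccard distance over whitespace-tokenized vocabularies used here is an assumption chosen for illustration, and the function names (`macro_f1`, `vocabulary`, `jaccard_distance`) are hypothetical rather than taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def vocabulary(texts):
    """Set of lowercased whitespace tokens (an assumed, simplistic tokenizer)."""
    return {token for text in texts for token in text.lower().split()}


def jaccard_distance(texts_a, texts_b):
    """Assumed dissimilarity measure: 1 - |A ∩ B| / |A ∪ B| over vocabularies."""
    vocab_a, vocab_b = vocabulary(texts_a), vocabulary(texts_b)
    union = vocab_a | vocab_b
    if not union:
        return 0.0
    return 1.0 - len(vocab_a & vocab_b) / len(union)
```

Because macro-F1 averages per-class scores without weighting by class frequency, the minority (abusive) class counts as much as the majority class, which is one reason it is a common choice for imbalanced abuse-detection benchmarks.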

PUBLICATION RECORD

Publication date
2025-11-10
Identifiers
DOI 10.1109/CASCON66301.2025.00020

REFERENCES

Towards a comprehensive taxonomy of online abusive language informed by machine leaning (2025)
MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection (2024)
Hate speech detection: A comprehensive review of recent works (2024)
What Did You Learn To Hate? A Topic-Oriented Analysis of Generalization in Hate Speech Detection (2023)
Anatomy of Hate Speech Datasets: Composition Analysis and Cross-dataset Classification (2023)
Generalizing Hate Speech Detection Using Multi-Task Learning: A Case Study of Political Public Figures (2022)
Introducing the Gab Hate Corpus: defining and applying hate-based rhetoric to social media posts at scale (2022)
Handling Bias in Toxic Speech Detection: A Survey (2022)
The Measuring Hate Speech Corpus: Leveraging Rasch Measurement Theory for Data Perspectivism (2022)
Bias and comparison framework for abusive language datasets (2021)
Human-in-the-Loop for Data Collection: a Multi-Target Counter Narrative Dataset to Fight Online Hate Speech (2021)
Introducing CAD: the Contextual Abuse Dataset (2021)
Towards generalisable hate speech detection: a review on obstacles and solutions (2021, influential reference)
How well do hate speech, toxicity, abusive and offensive language classification models generalize across datasets? (2021)
Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection (2021)
Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate (2021)
HateCheck: Functional Tests for Hate Speech Detection Models (2020)
BERTweet: A pre-trained language model for English Tweets (2020)
I Feel Offended, Don’t Be Abusive! Implicit/Explicit Messages in Offensive and Abusive Language (2020)
ETHOS: a multi-label hate speech detection dataset (2020)
Angry by design: toxic communication and technical architectures (2020)
Hate speech detection is not as easy as you may think: A closer look at model validation (extended version) (2020)
HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection (2020)
Directions in abusive language training data, a systematic review: Garbage in, garbage out (2020)
Hate Speech Detection is Not as Easy as You May Think: A Closer Look at Model Validation (2019)
Studying Generalisability across Abusive Language Detection Datasets (2019)
Challenges and frontiers in abusive content detection (2019)
Detection of Abusive Language: the Problem of Biased Datasets (2019, influential reference)
Merging Datasets for Aggressive Text Identification (2018)
Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models (2004)
Sensitivity Analysis in Practice (2002)
