Your Extreme Multi-label Classifier is Secretly a Hierarchical Text Classifier for Free

Nerijus Bertalis, Paul Granse, Ferhat Gul, Florian Hauss, Leon Menkel, D. Schuler, Tom Speier, Lukas Galke Poech, A. Scherp

Published 2024 in Unknown venue

ABSTRACT

Assigning a set of labels to a given text is a classification problem with many real-world applications, such as recommender systems. Two separate research streams address this problem. Hierarchical Text Classification (HTC) focuses on datasets with label pools of hundreds of entries, accompanied by a semantic label hierarchy. In contrast, eXtreme Multi-Label Text Classification (XML) considers very large label sets with up to millions of entries but without an explicit hierarchy. XML methods commonly construct an artificial hierarchy before or during training in order to deal with the large label space. Here, we investigate how state-of-the-art HTC models perform when trained and tested on XML datasets and vice versa, using three benchmark datasets from each of the two streams. Our results demonstrate that XML models, with their internally constructed hierarchy, are very effective HTC models. HTC models, on the other hand, are not equipped to handle the sheer label set size of XML datasets and achieve poor transfer results. We further argue that, for a fair comparison in HTC and XML, a single metric such as F1 is not sufficient and should be complemented with P@k and R-Precision.
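
The three metrics named in the abstract can be illustrated with a minimal per-document sketch. This is not the authors' evaluation code: the function names, the example labels, and the choice to compute F1 on a fixed predicted label set are assumptions made for illustration. P@k is taken here as the fraction of the top-k ranked labels that are correct, and R-Precision as P@k with k set to the number of true labels of the document.

```python
from typing import Sequence, Set


def precision_at_k(ranked_labels: Sequence[str], true_labels: Set[str], k: int) -> float:
    """Fraction of the top-k ranked labels that are in the true label set."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for label in ranked_labels[:k] if label in true_labels)
    return hits / k


def r_precision(ranked_labels: Sequence[str], true_labels: Set[str]) -> float:
    """Precision at R, where R is the number of true labels of the document."""
    r = len(true_labels)
    if r == 0:
        return 0.0  # convention assumed here for documents without labels
    return precision_at_k(ranked_labels, true_labels, r)


def f1(predicted: Set[str], true_labels: Set[str]) -> float:
    """Set-based F1 for one document, given a hard (thresholded) prediction."""
    true_positives = len(predicted & true_labels)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(true_labels)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: labels ranked by model score, best first.
ranked = ["sports", "football", "politics", "tennis", "economy"]
truth = {"sports", "football", "tennis"}

p_at_1 = precision_at_k(ranked, truth, 1)
p_at_3 = precision_at_k(ranked, truth, 3)
r_prec = r_precision(ranked, truth)
f1_top2 = f1(set(ranked[:2]), truth)  # F1 if only the top two labels are predicted
```

The ranking-based metrics depend only on the order of the predicted labels, while F1 additionally depends on how many labels are cut off as the final prediction, which is one reason the scores can diverge for the same model.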

PUBLICATION RECORD

Publication year
2024
Venue
Unknown venue
Publication date
2024-11-20
Fields of study
Computer Science
Identifiers
arXiv 2411.13687
Source metadata
Semantic Scholar

REFERENCES

Revisiting Adversarial Training Under Long-Tailed Distributions
2024, cited by this paper
HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification
2024, cited by this paper
Hierarchy-Aware and Label Balanced Model for Hierarchical Text Classification
2024, cited by this paper
Text Classification via Large Language Models
2023, cited by this paper
Open, Closed, or Small Language Models for Text Classification?
2023, cited by this paper
GPT-4 Technical Report
2023, cited by this paper
Llama 2: Open Foundation and Fine-Tuned Chat Models
2023, cited by this paper
Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification
2022, influential reference
Are We Really Making Much Progress in Text Classification? A Comparative Review
2022, cited by this paper
Exploiting Global and Local Hierarchies for Hierarchical Text Classification
2022, cited by this paper
HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification
2022, cited by this paper
A Survey on Text Classification Algorithms: From Text to Predictions
2022, cited by this paper
s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning
2021, cited by this paper
LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification
2021, cited by this paper
HTCInfoMax: A Global Model for Hierarchical Text Classification via Information Maximization
2021, cited by this paper
Do Transformers Really Perform Bad for Graph Representation?
2021, cited by this paper
Extreme Multi-label Learning for Semantic Matching in Product Search
2021, cited by this paper
Hierarchy-aware Label Semantics Matching Network for Hierarchical Text Classification
2021, cited by this paper
Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification
2021, influential reference
InceptionXML: A Lightweight Framework with Synchronized Negative Sampling for Short Text Extreme Classification
2021, cited by this paper
Hierarchy-Aware Global Model for Hierarchical Text Classification
2020, influential reference
A Survey on Text Classification: From Shallow to Deep Learning
2020, cited by this paper
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
2019, cited by this paper
RoBERTa: A Robustly Optimized BERT Pretraining Approach
2019, cited by this paper
XLNet: Generalized Autoregressive Pretraining for Language Understanding
2019, cited by this paper
Slice: Scalable Linear Extreme Classifiers Trained on 100 Million Labels for Related Searches
2019, cited by this paper
HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization
2018, cited by this paper
A no-regret generalization of hierarchical softmax to extreme multi-label classification
2018, cited by this paper
PPDsparse: A Parallel Primal-Dual Sparse Method for Extreme Classification
2017, cited by this paper
Graph Attention Networks
2017, cited by this paper
HDLTex: Hierarchical Deep Learning for Text Classification
2017, cited by this paper
Extreme Multi-label Loss Functions for Recommendation, Tagging, Ranking & Other Missing Label Applications
2016, influential reference
DiSMEC: Distributed Sparse Machines for Extreme Multi-label Classification
2016, cited by this paper
Hidden factors and hidden topics: understanding rating dimensions with review text
2013, cited by this paper
Enhancing Navigation on Wikipedia with Social Tags
2012, cited by this paper
Evaluation in Information Retrieval
2009, influential reference
Large scale multi-label classification via metalabeler
2009, cited by this paper
A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation
2005, cited by this paper
RCV1: A New Benchmark Collection for Text Categorization Research
2004, cited by this paper
Hierarchical text classification and evaluation
2001, cited by this paper
Bidirectional recurrent neural networks
1997, cited by this paper
Hierarchical Text Classification: A Review Of Current Research
year unknown, cited by this paper

CITED BY

Cross-Dataset Analysis of Language Models for Generalised Multi-label Review Note Distribution in Animated Productions
2025, cites this paper