LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought

Published 2025 in arXiv.org

ABSTRACT

Sample-wise learning curves plot performance versus training set size. They are useful for studying scaling laws and speeding up hyperparameter tuning and model selection. Learning curves are often assumed to be well-behaved: monotone (i.e. improving with more data) and convex. By constructing the Learning Curves Database 1.1 (LCDB 1.1), a large-scale database with high-resolution learning curves including more modern learners (CatBoost, TabNet, RealMLP and TabPFN), we show that learning curves are less often well-behaved than previously thought. Using statistically rigorous methods, we observe significant ill-behavior in approximately 15% of the learning curves, almost twice as much as in previous estimates. We also identify which learners are to blame and show that specific learners are more ill-behaved than others. Additionally, we demonstrate that different feature scalings rarely resolve ill-behavior. We evaluate the impact of ill-behavior on downstream tasks, such as learning curve fitting and model selection, and find it poses significant challenges, underscoring the relevance and potential of LCDB 1.1 as a challenging benchmark for future research.
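To make the two "well-behaved" properties in the abstract concrete, the following is a minimal sketch of how one might check them on a single error-rate learning curve. It is not the statistically rigorous procedure the abstract refers to: it ignores noise across repetitions, assumes convexity is judged on the raw training-set-size axis, and uses hypothetical function names (`is_monotone`, `is_convex`) and made-up curve values purely for illustration.

```python
import numpy as np


def is_monotone(errors, tol=0.0):
    """Return True if the error never rises by more than `tol` between anchors."""
    diffs = np.diff(np.asarray(errors, dtype=float))
    return bool(np.all(diffs <= tol))


def is_convex(sizes, errors, tol=0.0):
    """Return True if consecutive slopes are non-decreasing (up to `tol`).

    For a discrete curve this is the usual convexity condition: the slope
    between anchors i and i+1 must not be smaller than the slope between
    anchors i-1 and i.
    """
    sizes = np.asarray(sizes, dtype=float)
    errors = np.asarray(errors, dtype=float)
    slopes = np.diff(errors) / np.diff(sizes)
    return bool(np.all(np.diff(slopes) >= -tol))


# Hypothetical training set sizes and error rates (illustrative values only).
sizes = [16, 32, 64, 128, 256]
well_behaved = [0.40, 0.30, 0.24, 0.21, 0.20]  # error keeps falling, ever more slowly
peaking = [0.40, 0.30, 0.35, 0.25, 0.22]       # error rises at n = 64

for name, curve in [("well_behaved", well_behaved), ("peaking", peaking)]:
    print(name, is_monotone(curve), is_convex(sizes, curve))
```

On real curves the anchors are noisy estimates, so a plain comparison like this would flag many harmless fluctuations; the `tol` parameter is only a crude stand-in for a proper significance test.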

PUBLICATION RECORD

Publication year: 2025
Venue: arXiv.org
Publication date: 2025-05-21
Fields of study: Mathematics, Computer Science
Identifiers: DOI 10.48550/arXiv.2505.15657; arXiv 2505.15657
Source metadata: Semantic Scholar

CITED BY

Opleiding Informatica Building an experimental learning curve database from many configurations of the gradient boosting algorithm
year unknown; influential citation