Cost-Efficient Online Hyperparameter Optimization

Jingkang Wang, Mengye Ren, Ilija Bogunovic, Yuwen Xiong, R. Urtasun, M. Mackay, Paul Vicol, Jon Lorraine, D. Duvenaud, R. Grosse, Ting Chen, Simon Kornblith, Mohammad Norouzi

Published 2021 in arXiv.org

ABSTRACT

Recent work on hyperparameter optimization (HPO) has shown the possibility of training certain hyperparameters together with regular parameters. However, these online HPO algorithms still require running evaluation on a set of validation examples at each training step, steeply increasing the training cost. To decide when to query the validation loss, we model online HPO as a time-varying Bayesian optimization problem, on top of which we propose a novel "costly feedback" setting to capture the concept of the query cost. Under this setting, standard algorithms are cost-inefficient as they evaluate on the validation set at every round. In contrast, the cost-efficient GP-UCB algorithm proposed in this paper queries the unknown function only when the model is less confident about current decisions. We evaluate our proposed algorithm by tuning hyperparameters online for VGG and ResNet on CIFAR-10 and ImageNet100. Our proposed online HPO algorithm reaches human expert-level performance within a single run of the experiment, while incurring only modest computational overhead compared to regular training.
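
The idea of querying the validation loss only when the model is uncertain can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a static (not time-varying) objective, a one-dimensional hyperparameter, an RBF kernel, and a simple posterior-standard-deviation threshold as the query rule. The names `cost_efficient_ucb`, `std_threshold`, and `reward_fn` are hypothetical and chosen for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays (unit prior variance)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_query, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    if len(x_obs) == 0:
        # No data yet: fall back to the prior (mean 0, std 1).
        return np.zeros(len(x_query)), np.ones(len(x_query))
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def cost_efficient_ucb(reward_fn, candidates, rounds=50, beta=2.0, std_threshold=0.15):
    """UCB selection that pays for feedback only when the GP is uncertain.

    Assumption: "less confident" is modeled as posterior std above
    std_threshold at the selected point; the paper's actual rule may differ.
    """
    x_obs, y_obs, n_queries = [], [], 0
    choice = candidates[0]
    for _ in range(rounds):
        mean, std = gp_posterior(x_obs, y_obs, candidates)
        i = int(np.argmax(mean + beta * std))  # upper confidence bound
        choice = candidates[i]
        if std[i] > std_threshold:
            # Costly step: evaluate the (validation) reward and update the data.
            x_obs.append(choice)
            y_obs.append(reward_fn(choice))
            n_queries += 1
        # Otherwise: act on the current choice without paying for feedback.
    return choice, n_queries
```

As a usage example, `cost_efficient_ucb(lambda v: -(v - 0.6) ** 2, np.linspace(0.0, 1.0, 21))` treats the negative squared distance to 0.6 as a stand-in for negative validation loss. Under the settings above, a point that has been observed once already has posterior standard deviation below the threshold, so the number of paid queries is bounded by the number of candidates rather than by the number of rounds.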

PUBLICATION RECORD

Publication year
2021
Venue
arXiv.org
Publication date
2021-01-17
Fields of study
Mathematics, Computer Science
Identifiers
arXiv 2101.06590
Source metadata
Semantic Scholar

REFERENCES

Implicit differentiation of Lasso-type models for hyperparameter optimization
2020, cited by this paper
Time-varying Gaussian Process Bandit Optimization with Non-constant Evaluation Time
2020, cited by this paper
A Simple Framework for Contrastive Learning of Visual Representations
2020, cited by this paper
Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions
2019, influential reference
BoTorch: Programmable Bayesian Optimization in PyTorch
2019, cited by this paper
Bayesian Optimization for Iterative Learning
2019, cited by this paper
Contrastive Multiview Coding
2019, cited by this paper
Optimizing Millions of Hyperparameters by Implicit Differentiation
2019, cited by this paper
Bilevel Programming for Hyperparameter Optimization and Meta-Learning
2018, cited by this paper
Stochastic Hyperparameter Optimization through Hypernetworks
2018, influential reference
Adversarially Robust Optimization with Gaussian Processes
2018, cited by this paper
Taking the Human Out of the Loop: A Review of Bayesian Optimization
2016, cited by this paper
Hyperparameter optimization with approximate gradient
2016, cited by this paper
Time-Varying Gaussian Process Bandit Optimization
2016, influential reference
Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets
2016, cited by this paper
Bayesian Optimization with a Finite Budget: An Approximate Dynamic Programming Approach
2016, cited by this paper
Hyperband: Bandit-Based Configuration Evaluation for Hyperparameter Optimization
2016, cited by this paper
Learning Curve Prediction with Bayesian Neural Networks
2016, cited by this paper
HyperNetworks
2016, cited by this paper
Scalable Bayesian Optimization Using Deep Neural Networks
2015, influential reference
Non-stochastic Best Arm Identification and Hyperparameter Optimization
2015, influential reference
Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters
2015, cited by this paper
Deep Residual Learning for Image Recognition
2015, cited by this paper
Freeze-Thaw Bayesian Optimization
2014, cited by this paper
Random Search for Hyper-Parameter Optimization
2012, cited by this paper
Generic Methods for Optimization-Based Modeling
2012, cited by this paper
Practical Bayesian Optimization of Machine Learning Algorithms
2012, cited by this paper
Parallelizing Exploration-Exploitation Tradeoffs with Gaussian Process Bandit Optimization
2012, cited by this paper
Contextual Gaussian Process Bandit Optimization
2011, cited by this paper
Algorithms for Hyper-Parameter Optimization
2011, cited by this paper
Sequential Model-Based Optimization for General Algorithm Configuration
2011, cited by this paper
Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting
2009, influential reference
ImageNet: A large-scale hierarchical image database
2009, cited by this paper
Learning Multiple Layers of Features from Tiny Images
2009, cited by this paper
Finite-time Analysis of the Multiarmed Bandit Problem
2002, influential reference
The Nonstochastic Multiarmed Bandit Problem
2002, influential reference
Design and regularization of neural networks: the optimal use of a validation set
1996, cited by this paper
Mean Shift, Mode Seeking, and Clustering
1995, influential reference
On the limited memory BFGS method for large scale optimization
1989, cited by this paper

CITED BY

Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms
2021, cites this paper
Genealogical Population-Based Training for Hyperparameter Optimization
2021, cites this paper