Practical Bayesian Optimization of Machine Learning Algorithms
Jasper Snoek, H. Larochelle, Ryan P. Adams
Published 2012 in Neural Information Processing Systems

ABSTRACT
The use of machine learning algorithms frequently involves careful tuning of learning parameters and model hyperparameters. Unfortunately, this tuning is often a "black art" requiring expert experience, rules of thumb, or sometimes brute-force search. There is therefore great appeal for automatic approaches that can optimize the performance of any given learning algorithm to the problem at hand. In this work, we consider this problem through the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). We show that certain choices for the nature of the GP, such as the type of kernel and the treatment of its hyperparameters, can play a crucial role in obtaining a good optimizer that can achieve expert-level performance. We describe new algorithms that take into account the variable cost (duration) of learning algorithm experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms including latent Dirichlet allocation, structured SVMs and convolutional neural networks.
PUBLICATION RECORD
- Publication year
2012
- Venue
Neural Information Processing Systems
- Publication date
2012-06-13
- Fields of study
Mathematics, Computer Science
- Source metadata
Semantic Scholar
CONCEPTS
- bayesian optimization
A framework for optimizing the performance of learning algorithms by modeling generalization performance.
- convolutional neural networks
A class of deep neural networks used as a test case for evaluating optimization algorithms.
Aliases: CNNs
- gaussian process
A stochastic process used to model a learning algorithm's generalization performance as a sample.
Aliases: GP
- kernel
A function defining the covariance structure of the Gaussian process model used in optimization.
- latent dirichlet allocation
A generative statistical model used as a test case for evaluating optimization algorithms.
Aliases: LDA
- parallel experimentation
A method of leveraging multiple cores to run multiple learning algorithm experiments simultaneously.
- structured svms
A support vector machine variant used as a test case for evaluating optimization algorithms.
Aliases: Structured Support Vector Machines
- variable cost
The differing durations required to complete various learning algorithm experiments during tuning.
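The concepts above can be illustrated with a minimal sketch of GP-based Bayesian optimization. This is not the paper's implementation: the squared-exponential kernel with fixed hyperparameters, the 1-D toy objective, and the grid-based maximization of expected improvement are all simplifying assumptions made here for illustration.

```python
# Minimal sketch of Bayesian optimization with a Gaussian-process surrogate.
# Assumptions (not from the paper): squared-exponential kernel with fixed
# length scale and unit amplitude, a 1-D quadratic toy objective, and
# expected improvement maximized over a fixed grid.
import math
import numpy as np

def sqexp(a, b, length=0.3):
    """Squared-exponential covariance between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_star, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at query points."""
    K = sqexp(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = sqexp(x_obs, x_star)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    # Prior variance is 1 (unit-amplitude kernel); floor avoids divide-by-zero.
    var = np.maximum(1.0 - np.sum(v ** 2, axis=0), 1e-12)
    return mu, var

def norm_cdf(z):
    return np.array([0.5 * (1.0 + math.erf(t / math.sqrt(2.0))) for t in z])

def expected_improvement(mu, var, best):
    """EI for minimization: expected reduction below the current best value."""
    sigma = np.sqrt(var)
    z = (best - mu) / sigma
    pdf = np.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return sigma * (z * norm_cdf(z) + pdf)

# Toy objective standing in for a learning algorithm's validation error.
f = lambda x: (x - 0.6) ** 2

grid = np.linspace(0.0, 1.0, 201)
x_obs = np.array([0.0, 0.35, 1.0])       # initial "experiments"
y_obs = f(x_obs)

for _ in range(10):
    mu, var = gp_posterior(x_obs, y_obs, grid)
    ei = expected_improvement(mu, var, y_obs.min())
    x_next = grid[np.argmax(ei)]          # most promising next experiment
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, f(x_next))

best_x = x_obs[np.argmin(y_obs)]
```

Each loop iteration refits the surrogate and queries the point that maximizes expected improvement, trading off exploring high-variance regions against exploiting the current low-mean region; the paper's contributions extend this basic loop with kernel hyperparameter treatment, cost-aware acquisition, and parallel experiment scheduling.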