Optimal Rates for Zero-Order Convex Optimization: The Power of Two Function Evaluations

John C. Duchi, Michael I. Jordan, M. Wainwright, Andre Wibisono

Published 2013 in IEEE Transactions on Information Theory

ABSTRACT

We consider derivative-free algorithms for stochastic and nonstochastic convex optimization problems that use only function values rather than gradients. Focusing on nonasymptotic bounds on convergence rates, we show that if pairs of function values are available, algorithms for d-dimensional optimization that use gradient estimates based on random perturbations suffer a factor of at most √d in convergence rate over traditional stochastic gradient methods. We establish such results for both smooth and nonsmooth cases, sharpening previous analyses that suggested a worse dimension dependence, and extend our results to the case of multiple (m ≥ 2) evaluations. We complement our algorithmic development with information-theoretic lower bounds on the minimax convergence rate of such problems, establishing the sharpness of our achievable results up to constant (sometimes logarithmic) factors.
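The two-evaluation gradient estimate mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact procedure: the Gaussian perturbation, the fixed smoothing parameter delta, the 1/sqrt(t) step size, the absence of a projection step, and the names two_point_gradient, zero_order_descent, and quadratic are all choices made for this example.

```python
import math
import random


def two_point_gradient(f, x, delta=1e-4, rng=random):
    """Estimate a gradient of f at x from a pair of function values.

    Draws z ~ N(0, I), so that E[z z^T] = I, and returns
    (f(x + delta * z) - f(x)) / delta * z.
    """
    z = [rng.gauss(0.0, 1.0) for _ in x]
    x_shift = [xi + delta * zi for xi, zi in zip(x, z)]
    slope = (f(x_shift) - f(x)) / delta  # finite difference along direction z
    return [slope * zi for zi in z]


def zero_order_descent(f, x0, steps=2000, step_size=0.1, delta=1e-4, seed=0):
    """Unprojected stochastic gradient descent driven by the estimator above."""
    rng = random.Random(seed)
    x = list(x0)
    for t in range(1, steps + 1):
        g = two_point_gradient(f, x, delta, rng)
        eta = step_size / math.sqrt(t)  # decaying step size (illustrative choice)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x):
    """Toy convex objective (hypothetical) with its minimizer at the origin."""
    return 0.5 * sum(xi * xi for xi in x)


if __name__ == "__main__":
    x_final = zero_order_descent(quadratic, [1.0] * 5)
    print("objective after descent:", quadratic(x_final))
```

Each iteration queries the objective exactly twice and never touches a gradient. The random direction z makes the estimate noisier than a true stochastic gradient by a dimension-dependent factor, which is the cost that the abstract's √d statement quantifies.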

PUBLICATION RECORD

Publication year: 2013
Venue: IEEE Transactions on Information Theory
Publication date: 2013-12-07
Fields of study: Mathematics, Computer Science
Identifiers: DOI 10.1109/TIT.2015.2409256; arXiv 1312.2139
External record: Semantic Scholar
Source metadata: Semantic Scholar

REFERENCES

Graphical Models
2020, cited by this paper
Introduction to Optimization
2019, cited by this paper
Random Gradient-Free Minimization of Convex Functions
2015, influential reference
Stochastic Approximation approach to Stochastic Programming
2013, influential reference
Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming
2013, influential reference
Finite Sample Convergence Rates of Zero-Order Stochastic Optimization Methods
2012, cited by this paper
Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
2012, influential reference
Query Complexity of Derivative-Free Optimization
2012, cited by this paper
On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization
2012, influential reference
Randomized Smoothing for Stochastic Optimization
2011, influential reference
Stochastic Convex Optimization with Bandit Feedback
2011, cited by this paper
On the Fundamental Limits of Adaptive Sensing
2011, cited by this paper
Information-Theoretic Lower Bounds on the Oracle Complexity of Stochastic Convex Optimization
2010, cited by this paper
Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback
2010, influential reference
Introduction to derivative-free optimization
2010, cited by this paper
Information-theoretic lower bounds on the oracle complexity of convex optimization
2009, influential reference
Minimax Policies for Adversarial and Stochastic Bandits
2009, cited by this paper
Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization
2009, cited by this paper
High-Probability Regret Bounds for Bandit Online Linear Optimization
2008, cited by this paper
Graphical Models, Exponential Families, and Variational Inference
2008, cited by this paper
Introduction to Nonparametric Estimation
2008, influential reference
Introduction to Stochastic Search and Optimization. Estimation, Simulation, and Control (Spall, J.C.; 2003) [book review]
2007, cited by this paper
Smooth minimization of non-smooth functions
2005, cited by this paper
Learning structured prediction models: a large margin approach
2005, cited by this paper
Elements of Information Theory
2005, cited by this paper
Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control
2004, cited by this paper
Online convex optimization in the bandit setting: gradient descent without a gradient
2004, influential reference
Mirror descent and nonlinear projected subgradient methods for convex optimization
2003, influential reference
Introduction to stochastic search and optimization - estimation, simulation, and control
2003, cited by this paper
Stochastic Approximation and Recursive Algorithms and Applications
2003, cited by this paper
Online Convex Programming and Generalized Infinitesimal Gradient Ascent
2003, cited by this paper
On the generalization ability of on-line learning algorithms
2001, cited by this paper
The Ordered Subsets Mirror Descent Optimization Method with Applications to Tomography
2001, cited by this paper
The concentration of measure phenomenon
2001, influential reference
Metric characterization of random variables and random processes
2000, cited by this paper
The Robustness of the p-Norm Algorithms
1999, cited by this paper
Assouad, Fano, and Le Cam
1997, cited by this paper
Exponentiated Gradient Versus Gradient Descent for Linear Predictors
1997, cited by this paper
Convex analysis and minimization algorithms
1993, cited by this paper
Introduction to optimization
1987, cited by this paper
Asymptotically efficient adaptive allocation rules
1985, cited by this paper
Problem Complexity and Method Efficiency in Optimization
1983, influential reference
Stochastic optimization problems with nondifferentiable cost functionals
1973, cited by this paper
Some aspects of the sequential design of experiments
1952, cited by this paper
Asymptotically Efficient Adaptive Allocation Rules
year unknown, cited by this paper

CITED BY

Zeroth-order parallel sampling
2026, influential citation
Reinforcement Learning in Real Option Models
2026, cites this paper
Escaping from saddle points with perturbed gradient estimation
2026, cites this paper
Wavefront sensorless flat-top beam shaping with multi-stage ASPGD
2026, cites this paper
Towards efficient and reliable artificial intelligence through neuromorphic principles
2026, cites this paper
Zero-Order Optimization for LLM Fine-Tuning via Learnable Direction Sampling
2026, cites this paper
Small Gradient Norm Regret for Online Convex Optimization
2026, cites this paper
Deterministic Zeroth-Order Mirror Descent via Vector Fields with A Posteriori Certification
2026, cites this paper
Zeroth-Order Kronecker Optimization for Pretraining Language Models
2026, cites this paper
Zeroth-Order Stackelberg Control in Combinatorial Congestion Games
2026, cites this paper
Hi-ZFO: Hierarchical Zeroth- and First-Order LLM Fine-Tuning via Importance-Guided Tensor Selection
2026, cites this paper
Improved Dimension Dependence for Bandit Convex Optimization with Gradient Variations
2026, cites this paper
Distributed Stochastic Frank–Wolfe Algorithms With Zeroth and First-Order Variance Reduction for Aggregative Optimization
2026, cites this paper
Gradient-Free Approaches is a Key to an Efficient Interaction with Markovian Stochasticity
2026, influential citation
ZIVR: An Incremental Variance Reduction Technique For Zeroth-Order Composite Problems
2026, cites this paper
Zeroth-Order Feedback-Based Optimization for Distributed Energy Management
2026, cites this paper
Powering Up Zeroth-Order Training via Subspace Gradient Orthogonalization
2026, influential citation
The Blessing of Dimensionality in LLM Fine-tuning: A Variance-Curvature Perspective
2026, cites this paper
Guiding the Recommender: Information-Aware Auto-Bidding for Content Promotion
2026, cites this paper
AGZO: Activation-Guided Zeroth-Order Optimization for LLM Fine-Tuning
2026, cites this paper
VAMO: Efficient Zeroth-Order Variance Reduction for SGD with Faster Convergence
2025, cites this paper
MambaExtend: A Training-Free Approach to Improve Long Context Extension of Mamba
2025, cites this paper
Query-Efficient Zeroth-Order Algorithms for Nonconvex Constrained Optimization
2025, cites this paper
ComPO: Preference Alignment via Comparison Oracles
2025, cites this paper
One-Point Sampling for Distributed Bandit Convex Optimization With Time-Varying Constraints
2025, cites this paper
The Derivative-Free Fully-Corrective Frank-Wolfe Algorithm for Optimizing Functionals Over Probability Spaces
2025, cites this paper
Estimating the Effects of Sample Training Orders for Large Language Models without Retraining
2025, cites this paper
Bregman Linearized Augmented Lagrangian Method for Nonconvex Constrained Stochastic Zeroth-order Optimization
2025, cites this paper
A Structured Tour of Optimization with Finite Differences
2025, cites this paper
Scaling Recurrent Neural Networks to a Billion Parameters with Zero-Order Optimization
2025, cites this paper
Quantum Speedups for Markov Chain Monte Carlo Methods with Application to Optimization
2025, cites this paper
Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations
2025, influential citation
Simulation Optimization and Stochastic Gradients: Theory & Practice
2025, cites this paper
Learning Generalized Nash Equilibria in Non-Monotone Games with Quadratic Costs
2025, cites this paper
ZO-ASR: Zeroth-Order Fine-Tuning of Speech Foundation Models without Back-Propagation
2025, cites this paper
Bilevel ZOFO: Efficient LLM Fine-Tuning and Meta-Training
2025, cites this paper
Private Zeroth-Order Optimization with Public Data
2025, influential citation
BOND: License to Train with Black-Box Functions
2025, cites this paper
ZOQO: Zero-Order Quantized Optimization
2025, cites this paper
NoProp: Training Neural Networks without Full Back-propagation or Full Forward-propagation
2025, cites this paper
Deep learning of PDE correction and mesh adaption without automatic differentiation
2025, cites this paper
Low-Rank Curvature for Zeroth-Order Optimization in LLM Fine-Tuning
2025, cites this paper
Scalable Back-Propagation-Free Training of Optical Physics-Informed Neural Networks
2025, cites this paper
Zeroth Order Optimization for Pretraining Language Models
2025, cites this paper
ZO2: Scalable Zeroth-Order Fine-Tuning for Extremely Large Language Models with Limited GPU Memory
2025, cites this paper
Dimension-Free Estimators of Gradients of Functions with(out) Non-Independent Variables
2025, cites this paper
ConMeZO: Adaptive Descent-Direction Sampling for Gradient-Free Finetuning of Large Language Models
2025, cites this paper
Privacy Amplification in Differentially Private Zeroth-Order Optimization with Hidden States
2025, influential citation
ZIP: An Efficient Zeroth-order Prompt Tuning for Black-box Vision-Language Models
2025, cites this paper
Asymptotic Proximal Point Methods for Global Optimization
2025, cites this paper
Towards Fast LLM Fine-tuning through Zeroth-Order Optimization with Projected Gradient-Aligned Perturbations
2025, cites this paper
Multi-Objective min-max Online Convex Optimization
2025, cites this paper
Query-Efficient Zeroth-Order Algorithms for Nonconvex Optimization
2025, cites this paper
CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure
2025, cites this paper
AI-Aided Annealed Langevin Dynamics for Rapid Optimization of Programmable Channels
2025, cites this paper
High-Probability Analysis of Online and Federated Zero-Order Optimisation
2025, cites this paper
On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization
2025, influential citation
Non-Stationary Bandit Convex Optimization: An Optimal Algorithm with Two-Point Feedback
2025, influential citation
Subspace-based Approximate Hessian Method for Zeroth-Order Optimization
2025, cites this paper
Communication-Efficient Zero-Order and First-Order Federated Learning Methods over Wireless Networks
2025, cites this paper
Zeroth-Order Optimization Finds Flat Minima
2025, cites this paper
On the Inherent Privacy of Zeroth-Order Projected Gradient Descent
2025, influential citation
A line search framework with restarting for noisy optimization problems
2025, cites this paper
Investigating the Impact of Measurement Variance on Gene Circuit Model Parameterization
2025, cites this paper
Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order
2025, cites this paper
MobiEdit: Resource-efficient Knowledge Editing for Personalized On-device LLMs
2025, cites this paper
A Structured Proximal Stochastic Variance Reduced Zeroth-order Algorithm
2025, influential citation
A Zeroth-Order Extra-Gradient Method For Black-Box Constrained Optimization
2025, cites this paper
Duality and Policy Evaluation in Distributionally Robust Bayesian Diffusion Control
2025, cites this paper
Convergence rate of payoff-based generalized Nash equilibrium learning
2025, cites this paper
Hierarchical dynamic graphical games for optimal leader-follower consensus control
2025, cites this paper
Warming Up for Zeroth-Order Federated Pre-Training with Low Resource Clients
2025, cites this paper
Two-point Random Gradient-free Methods for Model-free Feedback Optimization
2025, cites this paper
The Multi-Query Paradox in Zeroth-Order Optimization
2025, cites this paper
Training-Free Token Pruning via Zeroth-Order Gradient Estimation in Vision-Language Models
2025, cites this paper
Downgrade to Upgrade: Optimizer Simplification Enhances Robustness in LLM Unlearning
2025, cites this paper
NoProp: Training Neural Networks without Back-propagation or Forward-propagation
2025, cites this paper
Zeroth-Order Sharpness-Aware Learning with Exponential Tilting
2025, cites this paper
SAGE: A Set-based Adaptive Gradient Estimator
2025, cites this paper
A Parameter-Free and Near-Optimal Zeroth-Order Algorithm for Stochastic Convex Optimization
2025, influential citation
Pareto Set Learning for Multi-Objective Reinforcement Learning
2025, cites this paper
Globally convergent derivative-free methods in nonconvex optimization with and without noise
2025, cites this paper
Solving Infinite-Player Games with Player-to-Strategy Networks
2025, cites this paper
Elucidating Subspace Perturbation in Zeroth-Order Optimization: Theory and Practice at Scale
2025, cites this paper
Bilevel ZOFO: Bridging Parameter-Efficient and Zeroth-Order Techniques for Efficient LLM Fine-Tuning and Meta-Training
2025, cites this paper
Learning to Instruct: Fine-Tuning a Task-Aware Instruction Optimizer for Black-Box LLMs
2025, cites this paper
Unifying Zeroth-Order Optimization and Genetic Algorithms for Reinforcement Learning
2025, cites this paper
Distributed Stochastic Zeroth-Order Optimization With Compressed Communication
2025, cites this paper
LORENZA: Enhancing Generalization in Low-Rank Gradient LLM Training via Efficient Zeroth-Order Adaptive SAM
2025, cites this paper
Stochastic Gradients: Optimization, Simulation, Randomization, and Sensitivity Analysis
2025, cites this paper
Zeroth-Order Extra-Gradient Method for Constrained Convex Optimization
2025, cites this paper
Robust Network Optimization by Deep Generative Models and Stochastic Optimization
2025, cites this paper
One‐Point Residual Feedback Algorithms for Distributed Online Convex and Non‐Convex Optimization
2025, cites this paper
Efficient Personalization of Quantized Diffusion Model without Backpropagation
2025, cites this paper
Gradient Compressed Sensing: A Query-Efficient Gradient Estimator for High-Dimensional Zeroth-Order Optimization
2024, cites this paper
Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior
2024, cites this paper
Highly Smooth Zeroth-Order Methods for Solving Optimization Problems under the PL Condition
2024, cites this paper
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
2024, cites this paper
Gradient Testing and Estimation by Comparisons
2024, cites this paper
Guided multi-objective generative AI to enhance structure-based drug design
2024, cites this paper