Online learning with kernel losses

Aldo Pacchiano, Niladri S. Chatterji, P. Bartlett

Published 2018 in International Conference on Machine Learning

ABSTRACT

We present a generalization of the adversarial linear bandits framework, where the underlying losses are kernel functions (with an associated reproducing kernel Hilbert space) rather than linear functions. We study a version of the exponential weights algorithm and bound its regret in this setting. Under conditions on the eigendecay of the kernel we provide a sharp characterization of the regret for this algorithm. When we have polynomial eigendecay $\mu_j \le \mathcal{O}(j^{-\beta})$, we find that the regret is bounded by $\mathcal{R}_n \le \mathcal{O}(n^{\beta/(2(\beta-1))})$; while under the assumption of exponential eigendecay $\mu_j \le \mathcal{O}(e^{-\beta j })$, we get an even tighter bound on the regret $\mathcal{R}_n \le \mathcal{O}(n^{1/2}\log(n)^{1/2})$. We also study the full information setting when the underlying losses are kernel functions and present an adapted exponential weights algorithm and a conditional gradient descent algorithm.
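As a quick reading aid for the polynomial-eigendecay bound quoted above, the following minimal sketch evaluates the exponent $\beta/(2(\beta-1))$ for a few values of $\beta$. The function name and the chosen values of $\beta$ are illustrative assumptions, not taken from the paper; the code only performs the arithmetic of the stated formula and says nothing about the constants or conditions of the actual theorem.

```python
def polynomial_decay_regret_exponent(beta: float) -> float:
    """Exponent of n in the abstract's bound O(n^{beta / (2 (beta - 1))}).

    Hypothetical helper for illustration only: it evaluates the formula
    from the abstract and does not encode any other assumption of the paper.
    """
    if beta <= 1:
        # For beta <= 1 the expression is undefined (beta = 1) or not positive.
        raise ValueError("beta / (2 (beta - 1)) needs beta > 1 to be positive and finite")
    return beta / (2.0 * (beta - 1.0))


if __name__ == "__main__":
    # Illustrative values of beta (an assumption, chosen to show the trend).
    for beta in (2.0, 3.0, 5.0, 11.0, 101.0):
        exponent = polynomial_decay_regret_exponent(beta)
        print(f"beta = {beta:6.1f}  ->  regret exponent = {exponent:.4f}")
```

Under this formula the exponent equals 1 at $\beta = 2$ and decreases toward $1/2$ as $\beta$ grows, which matches the abstract's remark that the exponential-eigendecay bound $\mathcal{O}(n^{1/2}\log(n)^{1/2})$ is the tighter of the two.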

PUBLICATION RECORD

Publication year: 2018
Venue: International Conference on Machine Learning
Publication date: 2018-02-27
Fields of study: Mathematics, Computer Science
Identifiers: arXiv:1802.09732
Source metadata: Semantic Scholar

CITED BY

Differentially Private Kernelized Contextual Bandits (2025)
Adversarial Contextual Bandits Go Kernelized (2023)
Contextual Gaussian Process Bandits with Neural Networks (2023)
Random Exploration in Bayesian Optimization: Order-Optimal Regret and Computational Efficiency (2023)
Sample Complexity of Kernel-Based Q-Learning (2023)
Reward Imputation with Sketching for Contextual Batched Bandits (2022)
Learning curves for Gaussian process regression with power-law priors and targets (2021)
Kernel-based online regression with canal loss (2021)
Efficient Bandit Convex Optimization: Beyond Linear Losses (2021)
Dynamic Pricing and Demand Learning on a Large Network of Products: A PAC-Bayesian Approach (2021)
Information Consistency of Stochastic Kriging and Its Implications (2021)
Approximation Theory Based Methods for RKHS Bandits (2020)
On Information Gain and Regret Bounds in Gaussian Process Bandits (2020, influential citation)
Rate-adaptive model selection over a collection of black-box contextual bandit algorithms (2020, influential citation)
Generalized Policy Elimination: an efficient algorithm for Nonparametric Contextual Bandits (2020)
Improving Neural Language Generation with Spectrum Control (2020)
Bandit Principal Component Analysis (2019)