Convergence and Loss Bounds for Bayesian Sequence Prediction

Published 2003 in IEEE Transactions on Information Theory

ABSTRACT

The probability of observing $x_t$ at time $t$, given past observations $x_1 \ldots x_{t-1}$, can be computed if the true generating distribution $\mu$ of the sequences $x_1 x_2 x_3 \ldots$ is known. If $\mu$ is unknown, but known to belong to a class $\mathcal{M}$, one can base one's prediction on the Bayes mix $\xi$, defined as a weighted sum of distributions $\nu \in \mathcal{M}$. Various convergence results of the mixture posterior $\xi_t$ to the true posterior $\mu_t$ are presented. In particular, a new (elementary) derivation of the convergence $\xi_t/\mu_t \to 1$ is provided, which additionally gives the rate of convergence. A general sequence predictor is allowed to choose an action $y_t$ based on $x_1 \ldots x_{t-1}$ and receives loss $\ell_{x_t y_t}$ if $x_t$ is the next symbol of the sequence. No assumptions are made on the structure of $\ell$ (apart from being bounded) and $\mathcal{M}$. The Bayes-optimal prediction scheme $\Lambda_\xi$ based on the mixture $\xi$ and the Bayes-optimal informed prediction scheme $\Lambda_\mu$ are defined, and the total loss $L_\xi$ of $\Lambda_\xi$ is bounded in terms of the total loss $L_\mu$ of $\Lambda_\mu$. It is shown that $L_\xi$ is bounded for bounded $L_\mu$ and that $L_\xi/L_\mu \to 1$ for $L_\mu \to \infty$. Convergence of the instantaneous losses is also proven.
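The setting described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper; it assumes a binary alphabet, a finite class $\mathcal{M}$ of Bernoulli distributions, and a loss matrix indexed as loss[x][y]. The function name bayes_mixture_run and all parameter names are hypothetical. It maintains the posterior weights of the mixture $\xi$, lets $\Lambda_\xi$ and $\Lambda_\mu$ each pick the action that minimizes expected loss under their own predictive distribution, and reports both total losses together with the final ratio $\xi_t/\mu_t$ on the observed symbol.

```python
import random


def bayes_mixture_run(thetas, weights, true_theta, loss, steps, seed=0):
    """Simulate binary sequence prediction with a Bayes mixture.

    Assumptions (illustrative, not from the paper):
      thetas     -- P(x=1) for each Bernoulli model nu in the class M
      weights    -- prior weights w_nu, summing to 1
      true_theta -- P(x=1) under the true distribution mu (a member of M)
      loss       -- bounded loss matrix, loss[x][y] for symbol x and action y
    Returns (total loss of Lambda_xi, total loss of Lambda_mu,
             last ratio xi_t(x_t) / mu_t(x_t)).
    """
    rng = random.Random(seed)
    post = list(weights)  # posterior weights, initially the prior
    actions = range(len(loss[0]))
    p_mu = [1.0 - true_theta, true_theta]
    loss_xi = loss_mu = 0.0
    ratio = None

    def best_action(p):
        # Bayes-optimal action: minimize expected loss under predictive p
        return min(actions, key=lambda y: sum(p[x] * loss[x][y] for x in (0, 1)))

    for _ in range(steps):
        # Mixture predictive probability: xi_t(1) = sum_nu w_nu * nu(1)
        xi1 = sum(w * th for w, th in zip(post, thetas))
        p_xi = [1.0 - xi1, xi1]

        y_xi = best_action(p_xi)  # Lambda_xi: does not know mu
        y_mu = best_action(p_mu)  # Lambda_mu: informed predictor

        x = 1 if rng.random() < true_theta else 0  # draw x_t from mu
        loss_xi += loss[x][y_xi]
        loss_mu += loss[x][y_mu]
        ratio = p_xi[x] / p_mu[x]

        # Bayes update: w_nu <- w_nu * nu(x_t) / xi_t(x_t)
        post = [w * (th if x == 1 else 1.0 - th) for w, th in zip(post, thetas)]
        z = sum(post)
        post = [w / z for w in post]

    return loss_xi, loss_mu, ratio


if __name__ == "__main__":
    zero_one = [[0, 1], [1, 0]]  # loss 1 for a wrong guess, 0 otherwise
    l_xi, l_mu, r = bayes_mixture_run(
        thetas=[0.2, 0.5, 0.8],
        weights=[1 / 3, 1 / 3, 1 / 3],
        true_theta=0.8,
        loss=zero_one,
        steps=2000,
    )
    print("L_xi =", l_xi, "L_mu =", l_mu, "xi_t/mu_t =", r)
```

In a run like this one would expect the two total losses to stay close and the ratio to approach 1, in line with the qualitative statements of the abstract; the simulation only illustrates the setup and proves nothing about the bounds themselves.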

PUBLICATION RECORD

Publication year: 2003
Venue: IEEE Transactions on Information Theory
Publication date: 2003-01-16
Fields of study: Mathematics, Computer Science
Identifiers: DOI 10.1109/TIT.2003.814488; arXiv cs/0301014
Source metadata: Semantic Scholar

CITED BY

Self-Predictive Universal AI (2023, cites this paper)
Putnam’s Diagonal Argument and the Impossibility of a Universal Learning Machine (2019, cites this paper)
Putnam’s Diagonal Argument and the Impossibility of a Universal Learning Machine (2018, cites this paper)
Statistical spectrum occupancy prediction for dynamic spectrum access: a classification (2018, influential citation)
Spectrum prediction in dynamic spectrum access systems (2018, influential citation)
Short Biography (2016, cites this paper)
Solomonoff Prediction and Occam’s Razor (2016, cites this paper)
Ultimate Intelligence Part I: Physical Completeness and Objectivity of Induction (2015, cites this paper)
The Foundations of Solomonoff Prediction (2013, cites this paper)
A Philosophical Treatise of Universal Induction (2011, cites this paper)
Treatise of Universal Induction (2011, cites this paper)
Discrete MDL Predicts in Total Variation (2009, cites this paper)
Open Problems in Universal Induction & Intelligence (2009, cites this paper)
Universal Algorithmic Intelligence: A Mathematical Top→Down Approach (2007, cites this paper)
On Universal Prediction and Bayesian Confirmation (2007, cites this paper)
of Random Sequences (2007, cites this paper)
On semimeasures predicting Martin-Löf random sequences (2007, cites this paper)
Algorithmic complexity bounds on future prediction errors (2007, cites this paper)
The Missing Consistency Theorem for Bayesian Learning: Stochastic Model Selection (2006, cites this paper)
Complexity Monotone in Conditions and Future Prediction Errors (2006, cites this paper)
On the Foundations of Universal Sequence Prediction (2006, cites this paper)
MDL convergence speed for Bernoulli sequences (2006, cites this paper)
Strong Asymptotic Assertions for Discrete MDL in Regression and Classification (2005, cites this paper)
Monotone Conditional Complexity Bounds on Future Prediction Errors (2005, cites this paper)
Adaptive Online Prediction by Following the Perturbed Leader (2005, cites this paper)
Asymptotics of discrete MDL for online prediction (2005, cites this paper)
Sequential Predictions based on Algorithmic Complexity (2005, influential citation)
Benelearn 2005 : Annual Machine Learning Conference of Belgium and the Netherlands. CTIT Proceedings of the 14th annual Machine Learning Conference of Belgium and the Netherlands (2005, cites this paper)
Universal Convergence of Semimeasures on Individual Random Sequences (2004, cites this paper)
On the Convergence Speed of MDL Predictions for Bernoulli Sequences (2004, cites this paper)
Online Methods in Learning Theory (2004, cites this paper)
Prediction with Expert Advice by Following the Perturbed Leader for General Weights (2004, cites this paper)
Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet (2003, influential citation)
Sequence Prediction Based on Monotone Complexity (2003, influential citation)