Inferential Induction: Joint Bayesian Estimation of MDPs and Value Functions

Christos Dimitrakakis, Hannes Eriksson, Emilio Jorge, Divya Grover, D. Basu

Published 2020 in arXiv.org

ABSTRACT

Bayesian reinforcement learning (BRL) offers a decision-theoretic solution to the problem of reinforcement learning. However, typical model-based BRL algorithms have focused either on maintaining a posterior distribution on models or value functions and combining this with approximate dynamic programming or tree search. This paper describes a novel backwards induction principle for performing joint Bayesian estimation of models and value functions, from which many new BRL algorithms can be obtained. We demonstrate this idea with algorithms and experiments in discrete state spaces.

PUBLICATION RECORD

Publication year: 2020
Venue: arXiv.org
Publication date: 2020-02-08
Fields of study: Mathematics, Computer Science
Identifiers: arXiv 2002.03098
Source metadata: Semantic Scholar

REFERENCES

Exploration by Distributional Reinforcement Learning (2018)
A Distributional Perspective on Reinforcement Learning (2017)
Generalization and Exploration via Randomized Value Functions (2014)
Robust Bayesian Reinforcement Learning through Tight Lower Bounds (2011)
Nonparametric Return Distribution Approximation for Reinforcement Learning (2010)
Gaussian Process Dynamic Programming (2009)
Bayesian Policy Gradient Algorithms (2006)
Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning (2003)
A Bayesian Framework for Reinforcement Learning (2000)
Bayesian Q-Learning (1998)
Markov Decision Processes: Discrete Stochastic Dynamic Programming (1994)
Optimal Statistical Decisions (1970, influential reference)
On the Likelihood That One Unknown Probability Exceeds Another in View of the Evidence of Two Samples (1933)
