Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

Jordi Grau-Moya, Felix Leibfried, Tim Genewein, Daniel A. Braun

Published 2016 in ECML/PKDD

ABSTRACT

Information-theoretic principles for learning and acting have been proposed to solve particular classes of Markov Decision Problems. Mathematically, such approaches are governed by a variational free energy principle and allow solving MDP planning problems with information-processing constraints expressed in terms of a Kullback-Leibler divergence with respect to a reference distribution. Here we consider a generalization of such MDP planners by taking model uncertainty into account. As model uncertainty can also be formalized as an information-processing constraint, we can derive a unified solution from a single generalized variational principle. We provide a generalized value iteration scheme together with a convergence proof. As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning. We demonstrate the benefits of this approach in a grid world simulation.
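The generalized value iteration mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a finite set of K candidate transition models with fixed weights `mu` (a stand-in for a belief over models), a known reward table, a prior policy `rho`, and two inverse-temperature parameters called `alpha` (policy constraint) and `beta` (model-uncertainty constraint) here. All function and variable names are hypothetical. Under these assumptions each KL-constrained optimization has a closed-form log-sum-exp solution, so the backup nests a soft aggregation over models inside a soft aggregation over actions.

```python
import numpy as np


def weighted_logsumexp(x, w, axis):
    """Numerically stable log(sum_i w_i * exp(x_i)) along `axis` (weights w > 0)."""
    m = np.max(x, axis=axis, keepdims=True)
    s = np.sum(w * np.exp(x - m), axis=axis, keepdims=True)
    return np.squeeze(m + np.log(s), axis=axis)


def free_energy_iteration(T, R, mu, rho, alpha, beta,
                          gamma=0.9, tol=1e-8, max_iter=10_000):
    """Sketch of a free-energy value iteration (illustrative assumptions only).

    T     : (K, S, A, S) candidate transition models T[k, s, a, s']
    R     : (S, A, S)    rewards R[s, a, s']
    mu    : (K,)         positive weights over the K models (assumed fixed)
    rho   : (S, A)       prior (reference) policy, rows sum to 1
    alpha : > 0, larger values let the policy deviate more from rho
    beta  : != 0, negative = pessimistic, near 0 = model average, positive = optimistic
    """
    F = np.zeros(T.shape[1])
    for _ in range(max_iter):
        # Expected return of each (state, action) under each candidate model.
        Q = np.einsum("ksat,sat->ksa", T, R + gamma * F[None, None, :])
        # Soft aggregation over models: (1/beta) log sum_k mu_k exp(beta Q_k).
        Qm = weighted_logsumexp(beta * Q, mu[:, None, None], axis=0) / beta
        # Soft aggregation over actions: (1/alpha) log sum_a rho exp(alpha Qm).
        F_new = weighted_logsumexp(alpha * Qm, rho, axis=1) / alpha
        done = np.max(np.abs(F_new - F)) < tol
        F = F_new
        if done:
            break
    # Boltzmann-like policy tilted away from the prior rho; rows sum to 1.
    policy = rho * np.exp(alpha * (Qm - F[:, None]))
    return F, policy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, S, A = 3, 5, 2
    T = rng.random((K, S, A, S))
    T /= T.sum(axis=-1, keepdims=True)  # normalize each row into a distribution
    R = rng.random((S, A, S))
    mu = np.full(K, 1.0 / K)
    rho = np.full((S, A), 1.0 / A)
    F, policy = free_energy_iteration(T, R, mu, rho, alpha=5.0, beta=-2.0)
    print(F)
    print(policy)
```

In this sketch, a large `alpha` with `beta` close to zero approaches ordinary value iteration on the weighted-average model, and a strongly negative `beta` approaches a worst-case backup over the candidate models, which loosely mirrors the Bayesian and robust limit cases listed in the abstract.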

PUBLICATION RECORD

Publication year: 2016
Venue: ECML/PKDD
Publication date: 2016-04-07
Fields of study: Computer Science
Identifiers: DOI 10.1007/978-3-319-46227-1_30; arXiv 1604.02080
Source metadata: Semantic Scholar

REFERENCES

Taming the Noise in Reinforcement Learning via Soft Updates (2015), cited by this paper
RLPy: a value-function-based reinforcement learning framework for education and research (2015), cited by this paper
Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach (2015), influential reference
Monte Carlo methods for exact & efficient solution of the generalized optimality equations (2014), cited by this paper
Generalized Thompson sampling for sequential decision-making and causal inference (2013), cited by this paper
Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search (2013), cited by this paper
Robust Markov Decision Processes (2013), cited by this paper
Risk-Sensitive Reinforcement Learning (2013), influential reference
Trading Value and Information in MDPs (2012), cited by this paper
Thermodynamics as a theory of decision-making with information-processing costs (2012), cited by this paper
Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search (2012), cited by this paper
Robustness and risk-sensitivity in Markov decision processes (2012), influential reference
Information Theory of Decisions and Actions (2011), influential reference
Path integral control and bounded rationality (2011), cited by this paper
A Bayesian Approach for Learning and Planning in Partially Observable Markov Decision Processes (2011), cited by this paper
Relative Entropy Policy Search (2010), cited by this paper
Risk Sensitive Path Integral Control (2010), cited by this paper
Model-based reinforcement learning with nearly tight exploration complexity bounds (2010), cited by this paper
Efficient computation of optimal actions (2009), cited by this paper
Reinforcement Learning in Finite MDPs: PAC Analysis (2009), cited by this paper
Neuro-Dynamic Programming (2009), cited by this paper
A Bayesian Rule for Adaptive Control based on Causal Interventions (2009), cited by this paper
A Minimum Relative Entropy Principle for Learning and Acting (2008), cited by this paper
The many faces of optimism: a unifying approach (2008), cited by this paper
Bias and Variance Approximation in Value Function Estimates (2007), cited by this paper
Linearly-solvable Markov decision problems (2006), cited by this paper
Robust Control of Markov Decision Processes with Uncertain Transition Matrices (2005), cited by this paper
Robust Dynamic Programming (2005), cited by this paper
Linear theory for control of nonlinear stochastic systems (2004), cited by this paper
Optimal learning: computational procedures for Bayes-adaptive Markov decision processes (2002), cited by this paper
Adaptive Control (1993), cited by this paper
Dynamic Programming (1993), cited by this paper

CITED BY

Inverse Decision Modeling: Learning Interpretable Representations of Behavior (2023), cites this paper
Complex behavior from intrinsic motivation to occupy future action-state path space (2022), cites this paper
Variational Inference for Model-Free and Model-Based Reinforcement Learning (2022), influential citation
Beyond Bayes-optimality: meta-learning what you know you don't know (2022), cites this paper
Model-Free Risk-Sensitive Reinforcement Learning (2021), cites this paper
Bellman: A Toolbox for Model-Based Reinforcement Learning in TensorFlow (2021), cites this paper
Bounded Rationality in Learning, Perception, Decision-Making, and Stochastic Games (2021), cites this paper
Reinforcement Learning with Subspaces using Free Energy Paradigm (2020), cites this paper
An Information-Theoretic Approach for Path Planning in Agents with Computational Constraints (2020), cites this paper
A Tutorial on Sparse Gaussian Processes and Variational Inference (2020), cites this paper
The two kinds of free energy and the Bayesian revolution (2020), cites this paper
Dynamic allocation of limited memory resources in reinforcement learning (2020), cites this paper
Specialization in Hierarchical Learning Systems (2020), cites this paper
Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning (2019), cites this paper
MUTUAL-INFORMATION REGULARIZATION (2019), cites this paper
Robust Reinforcement Learning for Continuous Control with Model Misspecification (2019), cites this paper
Disentangled Skill Embeddings for Reinforcement Learning (2019), cites this paper
An Information-theoretic On-line Learning Principle for Specialization in Hierarchical Decision-Making Systems (2019), cites this paper
A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment (2019), influential citation
Balancing Two-Player Stochastic Games with Soft Q-Learning (2018), cites this paper
Advancing Markov Decision Processes and Multivariate Gaussian Processes as Tools for Computational Psychiatry (2018), cites this paper
Task-Driven Estimation and Control via Information Bottlenecks (2018), cites this paper
Bounded Rational Decision-Making with Adaptive Neural Network Priors (2018), cites this paper
Non-Equilibrium Relations for Bounded Rational Decision-Making in Changing Environments (2017), cites this paper
AI Safety Gridworlds (2017), cites this paper
Decision-Making under Bounded Rationality and Model Uncertainty: an Information-Theoretic Approach (2017), cites this paper
Hierarchical state abstractions for decision-making problems with computational constraints (2017), influential citation
An Information-Theoretic Optimality Principle for Deep Reinforcement Learning (2017), cites this paper