Opponent Aware Reinforcement Learning
Víctor Gallego, Roi Naveiro, D. Insua, D. Gómez-Ullate
Published 2019 in arXiv.org
ABSTRACT
We introduce Threatened Markov Decision Processes (TMDPs) as an extension of the classical Markov Decision Process framework for Reinforcement Learning (RL). TMDPs support a decision maker facing potential opponents in an RL context. We also propose a level-k thinking scheme that yields a novel learning approach for TMDPs. After introducing our framework and deriving theoretical results, we provide relevant empirical evidence via extensive experiments, showing the benefits of accounting for adversaries in RL while the agent learns.
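The abstract describes learning in a TMDP, where the decision maker's value of an action depends on an opponent's simultaneous action, so the agent augments Q-learning with an opponent model and averages over it. The following is a minimal illustrative sketch of that idea in tabular form, not the paper's actual implementation: the toy repeated game, all names, and the empirical (Laplace-smoothed) opponent model are assumptions.

```python
import random
from collections import defaultdict

# Illustrative opponent-aware tabular Q-learning sketch (assumed details,
# not the paper's code): the agent keeps Q(s, a, b) over joint actions and
# an empirical model of the opponent, and acts on the expected Q-value.

ACTIONS = [0, 1]          # agent's actions
OPP_ACTIONS = [0, 1]      # opponent's actions
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

# Q[s][(a, b)]: value of agent action a when the opponent plays b in state s
Q = defaultdict(lambda: defaultdict(float))
# Empirical opponent model with a Laplace prior of 1 per opponent action
opp_counts = defaultdict(lambda: {b: 1.0 for b in OPP_ACTIONS})

def opp_prob(s, b):
    counts = opp_counts[s]
    return counts[b] / sum(counts.values())

def expected_q(s, a):
    # Expectation of Q over the opponent model
    return sum(opp_prob(s, b) * Q[s][(a, b)] for b in OPP_ACTIONS)

def choose_action(s):
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: expected_q(s, a))

def update(s, a, b, r, s_next):
    opp_counts[s][b] += 1.0
    target = r + GAMMA * max(expected_q(s_next, a2) for a2 in ACTIONS)
    Q[s][(a, b)] += ALPHA * (target - Q[s][(a, b)])

# Toy repeated game: the agent earns +1 for matching a biased opponent
# that plays action 0 with probability 0.8, and -1 otherwise.
random.seed(0)
s = 0  # single-state repeated game
for _ in range(5000):
    a = choose_action(s)
    b = 0 if random.random() < 0.8 else 1
    r = 1.0 if a == b else -1.0
    update(s, a, b, r, s)

best = max(ACTIONS, key=lambda a: expected_q(s, a))
print(best)
```

The key departure from plain Q-learning is that both the action choice and the bootstrap target average over the learned opponent distribution; in the level-k scheme the opponent model would itself be such a learner, one level down.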
PUBLICATION RECORD
- Publication year: 2019
- Venue: arXiv.org
- Publication date: 2019-08-22
- Fields of study: Mathematics, Computer Science
- Source metadata: Semantic Scholar
REFERENCES
- 42 references
CITED BY
- 7 citing papers