Clustering Markov Decision Processes For Continual Transfer

M. H. Mahmud, Majd Hawasly, Benjamin Rosman, S. Ramamoorthy

Published 2013 in arXiv.org

ABSTRACT

We consider the problem of a lifelong learning agent that continually encounters a sequence of tasks. After sufficient exposure to the domain, the agent may have accumulated a large library of task-specific optimal policies, which can then serve as a basis for transfer learning, that is, to speed up operation in a novel instance of a task from this domain. We present an algorithm for continual transfer learning in this setting of reinforcement learning with MDPs. A key question is whether the library can be encoded concisely to enable efficient transfer and search. To this end, we present a theory, framework, and algorithms for clustering MDPs so that a small subset of a large number of previous tasks can be selected as source tasks. Our contributions are as follows. We present a principled policy-reuse algorithm that optimally reuses a given set of source policies when solving a particular MDP. We then present a framework for clustering the previously solved MDPs to obtain a set of source MDPs that optimizes the reuse performance of our transfer algorithm. The framework consists of (i) a class of distance functions over MDPs that helps define clusterings of MDPs; (ii) a cost function that measures how good a particular clustering is for generating useful source tasks for the transfer algorithm; and (iii) a provably convergent optimization algorithm for finding the optimal clustering. Finally, we present a set of experiments that illustrate the efficacy of our approach.
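The three-part framework in the abstract (a distance over MDPs, a clustering cost, and an optimizer) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the paper's distance functions, cost function, or optimization algorithm, so this sketch substitutes a hypothetical precomputed distance matrix `DIST`, a generic k-medoids style cost, and an exhaustive search that is only practical for very small libraries.

```python
from itertools import combinations

def clustering_cost(dist, sources):
    """Sum, over all tasks, of the distance to the nearest chosen source task.
    This is a generic k-medoids cost, not the cost function from the paper."""
    return sum(min(dist[i][s] for s in sources) for i in range(len(dist)))

def select_sources(dist, k):
    """Pick k source tasks minimizing clustering_cost by exhaustive search.
    A stand-in for the paper's optimizer; feasible only for small libraries."""
    n = len(dist)
    best = min(combinations(range(n), k),
               key=lambda candidate: clustering_cost(dist, candidate))
    return list(best)

def assign(dist, sources):
    """Map each task to its nearest source task, which defines the clusters."""
    return [min(sources, key=lambda s: dist[i][s]) for i in range(len(dist))]

# Hypothetical pairwise distances between six previously solved MDPs.
# Tasks 0-2 form one group, tasks 3-5 another; the values are made up.
DIST = [
    [0, 1, 3, 10, 10, 10],
    [1, 0, 1, 10, 10, 10],
    [3, 1, 0, 10, 10, 10],
    [10, 10, 10, 0, 2, 5],
    [10, 10, 10, 2, 0, 2],
    [10, 10, 10, 5, 2, 0],
]

sources = select_sources(DIST, k=2)
clusters = assign(DIST, sources)
print(sources, clusters, clustering_cost(DIST, sources))
```

In this toy instance the search keeps one representative per group, so a transfer algorithm would only need to consider two source policies instead of six.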

PUBLICATION RECORD

  • Publication year

    2013

  • Venue

    arXiv.org

  • Publication date

    2013-11-15

  • Fields of study

    Computer Science

  • Source metadata

    Semantic Scholar
