Policy Gradient Methods in the Presence of Symmetries and State Abstractions

P. Panangaden, S. Rezaei-Shoshtari, Rosie Zhao, D. Meger, Doina Precup

Published 2023 in Journal of Machine Learning Research

ABSTRACT

Reinforcement learning (RL) on high-dimensional and complex problems relies on abstraction for improved efficiency and generalization. In this paper, we study abstraction in the continuous-control setting, and extend the definition of Markov decision process (MDP) homomorphisms to the setting of continuous state and action spaces. We derive a policy gradient theorem on the abstract MDP for both stochastic and deterministic policies. Our policy gradient results allow for leveraging approximate symmetries of the environment for policy optimization. Based on these theorems, we propose a family of actor-critic algorithms that are able to learn the policy and the MDP homomorphism map simultaneously, using the lax bisimulation metric. Finally, we introduce a series of environments with continuous symmetries to further demonstrate the ability of our algorithm for action abstraction in the presence of such symmetries. We demonstrate the effectiveness of our method on our environments, as well as on challenging visual control tasks from the DeepMind Control Suite. Our method's ability to utilize MDP homomorphisms for representation learning leads to improved performance, and the visualizations of the latent space clearly demonstrate the structure of the learned abstraction.

PUBLICATION RECORD

Publication year: 2023
Venue: Journal of Machine Learning Research
Publication date: 2023-05-09
Fields of study: Computer Science
Identifiers: DOI 10.48550/arXiv.2305.05666; arXiv 2305.05666
Source metadata: Semantic Scholar

CITED BY

Homomorphic Mappings for Value-Preserving State Aggregation in Markov Decision Processes
2025, influential citation
Learning Temporal Abstractions via Variational Homomorphisms in Option-Induced Abstract MDPs
2025, influential citation
Coordinated Humanoid Robot Locomotion with Symmetry Equivariant Reinforcement Learning Policy
2025, cites this paper
Leveraging Symmetry to Accelerate Learning of Trajectory Tracking Controllers for Free-Flying Robotic Systems
2024, influential citation
State Abstraction via Deep Supervised Hash Learning
2024, cites this paper
No Prior Mask: Eliminate Redundant Action for Deep Reinforcement Learning
2023, cites this paper
Extended Abstract Track
year unknown, influential citation