Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

Jesse Zhang, Brian Cheung, Chelsea Finn, S. Levine, Dinesh Jayaraman

Published 2020 in International Conference on Machine Learning

ABSTRACT

Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment. To overcome this difficulty, we propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments such as in a simulator, before it adapts to the target environment where failures carry heavy costs. We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk, which in turn enables relative safety through risk-averse, cautious adaptation. CARL first employs model-based RL to train a probabilistic model to capture uncertainty about transition dynamics and catastrophic states across varied source environments. Then, when exploring a new safety-critical environment with unknown dynamics, the CARL agent plans to avoid actions that could lead to catastrophic states. In experiments on car driving, cartpole balancing, half-cheetah locomotion, and robotic object manipulation, CARL successfully acquires cautious exploration behaviors, yielding higher rewards with fewer failures than strong RL adaptation baselines.
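
The idea of planning to avoid actions that could lead to catastrophic states can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that a probabilistic model has already produced several sampled returns for each candidate action, and it scores each action by the mean of its worst outcomes (a lower-tail, CVaR-style score) instead of its average. The names `cvar_lowest`, `pick_cautious_action`, the `alpha` parameter, and the numbers in `predictions` are all hypothetical.

```python
from statistics import mean


def cvar_lowest(returns, alpha):
    """Mean of the worst alpha-fraction of sampled returns (at least one sample)."""
    k = max(1, int(len(returns) * alpha))
    return mean(sorted(returns)[:k])


def pick_cautious_action(sampled_returns, alpha=0.2):
    """Pick the action whose worst-case tail looks best.

    sampled_returns maps each candidate action to a list of returns,
    one per sample drawn from a (hypothetical) probabilistic model.
    """
    return max(sampled_returns, key=lambda a: cvar_lowest(sampled_returns[a], alpha))


# Made-up predictions: "fast" looks better on average, but one sample
# predicts a costly failure; "slow" is consistently mediocre.
predictions = {
    "fast": [10.0, 11.0, 12.0, 10.5, -10.0],
    "slow": [6.0, 6.5, 7.0, 6.2, 5.8],
}

risk_neutral = max(predictions, key=lambda a: mean(predictions[a]))
cautious = pick_cautious_action(predictions, alpha=0.2)
print(risk_neutral, cautious)
```

In this toy setup the risk-neutral choice favors the action with the higher average, while the tail-based score prefers the action whose worst sampled outcome is least harmful. A smaller `alpha` makes the choice more pessimistic.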

PUBLICATION RECORD

Publication year: 2020
Venue: International Conference on Machine Learning
Publication date: 2020-07-12
Fields of study: Mathematics, Computer Science, Engineering, Psychology
Identifiers: arXiv 2008.06622
Source metadata: Semantic Scholar

REFERENCES

Risk-Aware Motion Planning and Control Using CVaR-Constrained Optimization (2019, cited by this paper)
Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning (2019, cited by this paper)
Model-based Deep Reinforcement Learning for Dynamic Portfolio Optimization (2019, cited by this paper)
Deep Dynamics Models for Learning Dexterous Manipulation (2019, influential reference)
Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic (2019, cited by this paper)
Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning (2018, cited by this paper)
Meta Reinforcement Learning with Latent Variable Gaussian Processes (2018, cited by this paper)
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models (2018, influential reference)
Proximal Policy Optimization Algorithms (2017, cited by this paper)
Cautious Model Predictive Control Using Gaussian Process Regression (2017, cited by this paper)
Domain randomization for transferring deep neural networks from simulation to the real world (2017, cited by this paper)
Sim-to-Real Transfer of Robotic Control with Dynamics Randomization (2017, cited by this paper)
Uncertainty-Aware Reinforcement Learning for Collision Avoidance (2017, cited by this paper)
Safe Model-based Reinforcement Learning with Stability Guarantees (2017, cited by this paper)
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (2017, influential reference)
Robust Adversarial Reinforcement Learning (2017, cited by this paper)
A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems (2017, cited by this paper)
Robust Constrained Learning-based NMPC enabling reliable mobile robot path tracking (2016, cited by this paper)
EPOpt: Learning Robust Neural Network Policies Using Model Ensembles (2016, cited by this paper)
(CAD)$^2$RL: Real Single-Image Flight without a Single Real Image (2016, cited by this paper)
Safe Control under Uncertainty with Probabilistic Signal Temporal Logic (2016, cited by this paper)
Bayesian Reinforcement Learning: A Survey (2015, cited by this paper)
Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach (2015, cited by this paper)
Scenario-based MPC with gradual relaxation of output constraints (2015, cited by this paper)
Algorithms for CVaR Optimization in MDPs (2014, cited by this paper)
Optimizing the CVaR via Sampling (2014, cited by this paper)
Provably safe and robust learning-based model predictive control (2011, cited by this paper)
Risk-constrained Markov decision processes (2010, cited by this paper)
Nonparametric Return Distribution Approximation for Reinforcement Learning (2010, cited by this paper)
A Bayesian Framework for Reinforcement Learning (2000, cited by this paper)
Optimization of conditional value-at-risk (2000, cited by this paper)

CITED BY

Laplacian Representations for Decision-Time Planning (2026, cites this paper)
Topological Robust Reinforcement Learning (2025, cites this paper)
TLXML: Task-Level Explanation of Meta-Learning via Influence Functions (2025, cites this paper)
Towards provable probabilistic safety for scalable embodied AI systems (2025, cites this paper)
Lyapunov-Inspired Deep Reinforcement Learning for Robot Navigation in Obstacle Environments (2025, cites this paper)
Hybrid Action Based Reinforcement Learning for Multi-Objective Compatible Autonomous Driving (2025, cites this paper)
A Schwarz-Christoffel Mapping-based Framework for Sim-to-Real Transfer in Autonomous Robot Operations (2025, cites this paper)
A meta-reinforcement learning method for adaptive payload transportation with variations (2025, cites this paper)
A collective AI via lifelong learning and sharing at the edge (2024, cites this paper)
Distributionally Robust Constrained Reinforcement Learning under Strong Duality (2024, cites this paper)
Monitored Markov Decision Processes (2024, cites this paper)
Causally aware reinforcement learning agents for autonomous cyber defence (2024, cites this paper)
A Safe Exploration Strategy for Model-Free Task Adaptation in Safety-Constrained Grid Environments (2024, cites this paper)
Robust Fast Adaptation from Adversarially Explicit Task Distribution Generation (2024, cites this paper)
Anomalous State Sequence Modeling to Enhance Safety in Reinforcement Learning (2024, cites this paper)
Enhancing Hardware Fault Tolerance in Machines with Reinforcement Learning Policy Gradient Algorithms (2024, cites this paper)
Generalized constraint for probabilistic safe reinforcement learning (2024, influential citation)
Adaptive Aggregation for Safety-Critical Control (2023, cites this paper)
Safely Learning Dynamical Systems (2023, cites this paper)
Probabilistic Constraint for Safety-Critical Reinforcement Learning (2023, cites this paper)
Model-Assisted Probabilistic Safe Adaptive Control With Meta-Bayesian Learning (2023, cites this paper)
Reinforcement Learning by Guided Safe Exploration (2023, cites this paper)
A Simple Yet Effective Strategy to Robustify the Meta Learning Paradigm (2023, cites this paper)
Safe and Efficient Robot Learning by Biasing Exploration Towards Expert Demonstrations (2023, cites this paper)
Attentive Abstractions for Flexible Vision-Based Robot Learners (2023, cites this paper)
MoDem-V2: Visuo-Motor World Models for Real-World Robot Manipulation (2023, cites this paper)
Resilient Constrained Reinforcement Learning (2023, cites this paper)
Distributional Reinforcement Learning with Online Risk-awareness Adaption (2023, cites this paper)
DRL-ORA: Distributional Reinforcement Learning with Online Risk Adaption (2023, cites this paper)
Approximate Shielding of Atari Agents for Safe Exploration (2023, cites this paper)
MSVIPER: Improved Policy Distillation for Reinforcement-Learning-Based Robot Navigation (2022, cites this paper)
Learning Differentiable Safety-Critical Control using Control Barrier Functions for Generalization to Novel Environments (2022, cites this paper)
SafeAPT: Safe Simulation-to-Real Robot Learning Using Diverse Policies Learned in Simulation (2022, cites this paper)
Robust Policy Learning over Multiple Uncertainty Sets (2022, cites this paper)
A survey on model-based reinforcement learning (2022, cites this paper)
Safe Reinforcement Learning with Contrastive Risk Prediction (2022, cites this paper)
Effects of Safety State Augmentation on Safe Exploration (2022, cites this paper)
Toward Airworthiness Certification for Artificial Intelligence (AI) in Aerospace Systems (2022, cites this paper)
SMS-MPC: Adversarial Learning-based Simultaneous Prediction Control with Single Model for Mobile Robots (2022, cites this paper)
One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning (2022, cites this paper)
Training and Transferring Safe Policies in Reinforcement Learning (2022, cites this paper)
LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Iterative Tasks (2021, cites this paper)
Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning (2021, influential citation)
How to certify machine learning based safety-critical systems? A systematic literature review (2021, cites this paper)
Safe Exploration by Solving Early Terminated MDP (2021, cites this paper)
Lyapunov Barrier Policy Optimization (2021, cites this paper)
Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (2021, cites this paper)
MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance (2021, influential citation)
A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning (2021, influential citation)
Look Before You Leap: Safe Model-Based Reinforcement Learning with Human Intervention (2021, cites this paper)
Risk Sensitive Model-Based Reinforcement Learning using Uncertainty Guided Planning (2021, cites this paper)
Learning to Be Cautious (2021, cites this paper)
Policy Search using Dynamic Mirror Descent MPC for Model Free Off Policy RL (2021, cites this paper)
LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Sparse Reward Iterative Tasks (2021, cites this paper)
A Conformal Mapping-based Framework for Robot-to-Robot and Sim-to-Real Transfer Learning (2021, cites this paper)
Learning to Synthesize Programs as Interpretable and Generalizable Policies (2021, cites this paper)
Safe Driving via Expert Guided Policy Optimization (2021, cites this paper)
Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies (2020, cites this paper)
Safety Aware Reinforcement Learning (SARL) (2020, influential citation)
Reset-Free Lifelong Learning with Skill-Space Planning (2020, cites this paper)
C-Learning: Horizon-Aware Cumulative Accessibility Estimation (2020, cites this paper)
The IISc Thesis Template A (year unknown, cites this paper)