Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning

Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, A. Singla

Published 2020 in International Conference on Machine Learning

ABSTRACT

We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker. As a victim, we consider RL agents whose objective is to find a policy that maximizes average reward in undiscounted infinite-horizon problem settings. The attacker can manipulate the rewards or the transition dynamics in the learning environment at training-time and is interested in doing so in a stealthy manner. We propose an optimization framework for finding an optimal stealthy attack for different measures of attack cost. We provide sufficient technical conditions under which the attack is feasible and provide lower/upper bounds on the attack cost. We instantiate our attacks in two settings: (i) an offline setting where the agent is doing planning in the poisoned environment, and (ii) an online setting where the agent is learning a policy using a regret-minimization framework with poisoned feedback. Our results show that the attacker can easily succeed in teaching any target policy to the victim under mild conditions and highlight a significant security threat to reinforcement learning agents in practice.
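
ILLUSTRATIVE SKETCH

The abstract describes an attacker who changes rewards so that a chosen target policy becomes optimal under the average-reward criterion. The snippet below is a minimal sketch of that idea on a toy problem, not the optimization framework from the paper. Everything in it is an assumption made for illustration: the two-state MDP and its numbers, the helper names (average_reward, best_policy, poison_rewards), the margin eps, and the naive attack itself, which subtracts one uniform penalty from every non-target action and grows it until a brute-force check over all deterministic policies passes. It makes no claim of minimal attack cost.

```python
from itertools import product

# Toy ergodic MDP; all numbers are made up for illustration.
# P[s][a] is the next-state distribution, R[s][a] the reward.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # state 0: action 0, action 1
    [[0.7, 0.3], [0.1, 0.9]],  # state 1: action 0, action 1
]
R = [
    [1.0, 0.0],
    [0.5, 2.0],
]
N_STATES, N_ACTIONS = 2, 2
ALL_POLICIES = list(product(range(N_ACTIONS), repeat=N_STATES))


def average_reward(policy, P, R, iters=2000):
    """Average reward of a deterministic policy.

    Approximates the stationary distribution by power iteration,
    then takes the expected one-step reward under it.
    """
    mu = [1.0 / N_STATES] * N_STATES
    for _ in range(iters):
        mu = [
            sum(mu[s] * P[s][policy[s]][t] for s in range(N_STATES))
            for t in range(N_STATES)
        ]
    return sum(mu[s] * R[s][policy[s]] for s in range(N_STATES))


def best_policy(P, R):
    """Brute-force planner: the deterministic policy with the highest average reward."""
    return max(ALL_POLICIES, key=lambda pi: average_reward(pi, P, R))


def poison_rewards(P, R, target, eps=0.1, step=0.05, max_steps=10_000):
    """Naive reward poisoning (hypothetical, not cost-optimal).

    Lowers the reward of every non-target action by the same penalty
    delta, increasing delta until the target policy beats every other
    deterministic policy by at least eps in average reward.
    """
    for k in range(max_steps):
        delta = k * step
        R_hat = [
            [R[s][a] - (0.0 if a == target[s] else delta) for a in range(N_ACTIONS)]
            for s in range(N_STATES)
        ]
        rho_target = average_reward(target, P, R_hat)
        if all(
            rho_target >= average_reward(pi, P, R_hat) + eps
            for pi in ALL_POLICIES
            if pi != target
        ):
            return R_hat, delta
    raise RuntimeError("no feasible penalty found within max_steps")


target = (0, 0)  # policy the attacker wants the agent to adopt
R_hat, delta = poison_rewards(P, R, target)
print("optimal policy before attack:", best_policy(P, R))
print("optimal policy after attack: ", best_policy(P, R_hat))
print("uniform penalty used:", delta)
```

The search terminates on this toy problem because every policy visits every state with positive probability, so any policy that deviates from the target keeps losing average reward as the penalty grows, while the target policy's average reward is unchanged. A planner run on the poisoned rewards then returns the target policy, which mirrors the offline setting described in the abstract.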

PUBLICATION RECORD

Publication year: 2020
Venue: International Conference on Machine Learning
Publication date: 2020-03-28
Fields of study: Mathematics, Computer Science
Identifiers: arXiv 2003.12909
Source metadata: Semantic Scholar

REFERENCES

Optimistic posterior sampling for reinforcement learning: worst-case regret bounds (2022, cited by this paper)
Understanding the Power and Limitations of Teaching with Imperfect Knowledge (2020, cited by this paper)
Reinforcement Learning for Cyber-Physical Systems (2019, cited by this paper)
A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems (2019, cited by this paper)
Policy Poisoning in Batch Reinforcement Learning and Control (2019, influential reference)
Adversarial machine learning (2019, cited by this paper)
Deceptive Reinforcement Learning Under Adversarial Manipulations on Cost Signals (2019, cited by this paper)
Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints (2019, cited by this paper)
Interactive Teaching Algorithms for Inverse Reinforcement Learning (2019, cited by this paper)
Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models (2019, cited by this paper)
Data Poisoning Attacks on Stochastic Bandits (2019, cited by this paper)
Sequential Attacks on Agents for Long-Term Adversarial Goals (2018, cited by this paper)
Machine Teaching of Active Sequential Learners (2018, cited by this paper)
Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners (2018, cited by this paper)
An Overview of Machine Teaching (2018, cited by this paper)
Teaching Inverse Reinforcement Learners via Features and Demonstrations (2018, cited by this paper)
Data Poisoning Attacks in Contextual Bandits (2018, cited by this paper)
Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications (2018, cited by this paper)
An Algorithmic Perspective on Imitation Learning (2018, cited by this paper)
Stronger data poisoning attacks break data sanitization defenses (2018, cited by this paper)
An Optimal Control View of Adversarial Machine Learning (2018, cited by this paper)
Adversarial Attacks on Stochastic Bandits (2018, cited by this paper)
Tactics of Adversarial Attack on Deep Reinforcement Learning Agents (2017, cited by this paper)
Adversarial Attacks on Neural Network Policies (2017, cited by this paper)
Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning (2017, cited by this paper)
Data Poisoning Attacks on Factorization-Based Collaborative Filtering (2016, cited by this paper)
Cooperative Inverse Reinforcement Learning (2016, cited by this paper)
Data Poisoning Attacks against Autoregressive Models (2016, cited by this paper)
Machine Teaching: An Inverse Problem to Machine Learning and an Approach Toward Optimal Education (2015, cited by this paper)
Human-level control through deep reinforcement learning (2015, cited by this paper)
Is Feature Selection Secure against Training Data Poisoning? (2015, cited by this paper)
Trust Region Policy Optimization (2015, cited by this paper)
Using Machine Teaching to Identify Optimal Training-Set Attacks on Machine Learners (2015, cited by this paper)
Near-Optimally Teaching the Crowd to Classify (2014, cited by this paper)
Simple and Scalable Response Prediction for Display Advertising (2014, cited by this paper)
On Actively Teaching the Crowd to Classify (2013, cited by this paper)
Poisoning Attacks against Support Vector Machines (2012, cited by this paper)
Dynamic Teaching in Sequential Decision Making Environments (2012, cited by this paper)
Algorithmic and Human Teaching of Sequential Decision Tasks (2012, cited by this paper)
What's Clicking What? Techniques and Innovations of Today's Clickbots (2011, cited by this paper)
Adversarial Machine Learning (2011, cited by this paper)
A contextual-bandit approach to personalized news article recommendation (2010, cited by this paper)
Policy teaching through reward function learning (2009, cited by this paper)
Near-optimal Regret Bounds for Reinforcement Learning (2008, influential reference)
Potential-based Shaping in Model-based Reinforcement Learning (2008, cited by this paper)
Value-Based Policy Teaching with Active Indirect Elicitation (2008, cited by this paper)
Interactive robot task training through dialog and demonstration (2007, cited by this paper)
Logarithmic Online Regret Bounds for Undiscounted Reinforcement Learning (2006, influential reference)
Experts in a Markov Decision Process (2004, cited by this paper)
Average reward reinforcement learning: Foundations, algorithms, and empirical results (2004, influential reference)
Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping (1999, cited by this paper)
Reinforcement Learning: An Introduction (1998, cited by this paper)
Markov Decision Processes: Discrete Stochastic Dynamic Programming (1994, influential reference)
H-Learning: A Reinforcement Learning Method for Optimizing Undiscounted Average Reward (1994, cited by this paper)
Automatic Programming of Behavior-Based Robots Using Reinforcement Learning (1991, cited by this paper)
On the complexity of teaching (1991, cited by this paper)

CITED BY

Unleashing the predators: Autonomous predation and manipulation through algorithms (2026, cites this paper)
HPA: Manipulating deep reinforcement learning via adversarial interaction (2026, cites this paper)
Nonparametric Teaching of Attention Learners (2026, cites this paper)
LLM-enabled Applications Require System-Level Threat Monitoring (2026, cites this paper)
SUNRISE: multi-agent reinforcement learning via neighbors' observations under fully noisy environments (2025, cites this paper)
Empowering artificial intelligence with homomorphic encryption for secure deep reinforcement learning (2025, cites this paper)
Exposing Vulnerabilities in RL: A Novel Stealthy Backdoor Attack through Reward Poisoning (2025, cites this paper)
Universal Black-Box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning (2025, cites this paper)
A traceless backdoor attack in offline reinforcement learning (2025, cites this paper)
Provably Invincible Adversarial Attacks on Reinforcement Learning Systems: A Rate-Distortion Information-Theoretic Approach (2025, influential citation)
Pruning Cannot Hurt Robustness: Certified Trade-offs in Reinforcement Learning (2025, cites this paper)
Constrained Black-Box Attacks Against Cooperative Multi-Agent Reinforcement Learning (2025, cites this paper)
Action Robust Reinforcement Learning via Optimal Adversary Aware Policy Optimization (2025, cites this paper)
Beyond Training-time Poisoning: Component-level and Post-training Backdoors in Deep Reinforcement Learning (2025, cites this paper)
SoK: Private Knowledge Sharing in Distributed Learning (2025, cites this paper)
Policy Disruption in Reinforcement Learning: Adversarial Attack with Large Language Models and Critical State Identification (2025, cites this paper)
Adversarial attacks on deep reinforcement learning applications in electric vehicle charging scheduling: A dual-stage attack framework (2025, cites this paper)
Off-Policy Actor-Critic for Adversarial Observation Robustness: Virtual Alternative Training via Symmetric Policy Evaluation (2025, cites this paper)
Can In-Context Reinforcement Learning Recover From Reward Poisoning Attacks? (2025, cites this paper)
Fox in the Henhouse: Supply-Chain Backdoor Attacks Against Reinforcement Learning (2025, cites this paper)
Nonparametric Teaching for Graph Property Learners (2025, cites this paper)
Policy Teaching via Data Poisoning in Learning from Human Preferences (2025, cites this paper)
Optimally Installing Strict Equilibria (2025, cites this paper)
The Distributional Reward Critic Framework for Reinforcement Learning Under Perturbed Rewards (2024, cites this paper)
Data poisoning attacks on off-policy policy evaluation methods (2024, cites this paper)
Improve Robustness of Safe Reinforcement Learning Against Adversarial Attacks (2024, cites this paper)
Online Poisoning Attack Against Reinforcement Learning under Black-box Environments (2024, influential citation)
Reward Poisoning on Federated Reinforcement Learning (2024, cites this paper)
When Rewards Deceive: Counteracting Reward Poisoning on Online Deep Reinforcement Learning (2024, cites this paper)
Backdoor Attacks on Safe Reinforcement Learning-Enabled Cyber–Physical Systems (2024, cites this paper)
Promoting or Hindering: Stealthy Black-Box Attacks Against DRL-Based Traffic Signal Control (2024, cites this paper)
Defending Against Unknown Corrupted Agents: Reinforcement Learning of Adversarially Robust Nash Equilibria (2024, cites this paper)
Safe Multi-Agent Reinforcement Learning for Wireless Applications Against Adversarial Communications (2024, cites this paper)
Mitigating Deep Reinforcement Learning Backdoors in the Neural Activation Space (2024, cites this paper)
Inception: Efficiently Computable Misinformation Attacks on Markov Games (2024, cites this paper)
Robust Deep Reinforcement Learning against Adversarial Behavior Manipulation (2024, cites this paper)
SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents (2024, cites this paper)
Adaptive Discounting of Training Time Attacks (2024, influential citation)
Nonparametric Teaching of Implicit Neural Representations (2024, cites this paper)
Policy Iteration for Pareto-Optimal Policies in Stochastic Stackelberg Games (2024, cites this paper)
Robustness of Updatable Learning-based Index Advisors against Poisoning Attack (2024, cites this paper)
Corruption-Robust Offline Two-Player Zero-Sum Markov Games (2024, cites this paper)
Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies (2024, cites this paper)
Universal Black-Box Reward Poisoning Attack against Offline Reinforcement Learning (2024, cites this paper)
Corruption Robust Offline Reinforcement Learning with Human Feedback (2024, cites this paper)
Informativeness of Reward Functions in Reinforcement Learning (2024, cites this paper)
Assessing the Impact of Distribution Shift on Reinforcement Learning Performance (2024, cites this paper)
Security and Privacy Issues in Deep Reinforcement Learning: Threats and Countermeasures (2024, cites this paper)
Optimally Teaching a Linear Behavior Cloning Agent (2023, cites this paper)
Learning key steps to attack deep reinforcement learning agents (2023, cites this paper)
Action Poisoning Attacks on Linear Contextual Bandits (2023, cites this paper)
Policy Resilience to Environment Poisoning Attacks on Reinforcement Learning (2023, cites this paper)
Attacking Cooperative Multi-Agent Reinforcement Learning by Adversarial Minority Influence (2023, cites this paper)
Implicit Poisoning Attacks in Two-Agent Reinforcement Learning: Adversarial Policies for Training-Time Attacks (2023, influential citation)
Local Environment Poisoning Attacks on Federated Reinforcement Learning (2023, cites this paper)
Reclaiming the Digital Commons: A Public Data Trust for Training Data (2023, cites this paper)
Connected Superlevel Set in (Deep) Reinforcement Learning and its Application to Minimax Theorems (2023, cites this paper)
Policy Poisoning in Batch Learning for Linear Quadratic Control Systems via State Manipulation (2023, cites this paper)
Teaching Reinforcement Learning Agents via Reinforcement Learning (2023, cites this paper)
ATS-O2A: A state-based adversarial attack strategy on deep reinforcement learning (2023, cites this paper)
Black-Box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning (2023, cites this paper)
Nonparametric Iterative Machine Teaching (2023, cites this paper)
Data Poisoning to Fake a Nash Equilibrium in Markov Games (2023, cites this paper)
Adversarial Attacks Against Online Reinforcement Learning Agents in MDPs (2023, cites this paper)
Certifiably Robust Policy Learning against Adversarial Multi-Agent Communication (2023, cites this paper)
Adversarial Attacks Against Online Learning Agents (2023, cites this paper)
Efficient Adversarial Attacks on Online Multi-agent Reinforcement Learning (2023, cites this paper)
Multi-Environment Training Against Reward Poisoning Attacks on Deep Reinforcement Learning (2023, cites this paper)
Reward poisoning attacks in deep reinforcement learning based on exploration strategies (2023, cites this paper)
Poisoning the Well: Can We Simultaneously Attack a Group of Learning Agents? (2023, cites this paper)
Solving Non-Rectangular Reward-Robust MDPs via Frequency Regularization (2023, cites this paper)
Unleashing the Predators: Autonomous Predation and Manipulation Through Algorithms (2023, cites this paper)
Adversarial Attacks on Combinatorial Multi-Armed Bandits (2023, cites this paper)
An Intelligent Secure Adversarial Examples Detection Scheme in Heterogeneous Complex Environments (2023, cites this paper)
Minimally Modifying a Markov Game to Achieve Any Nash Equilibrium and Value (2023, cites this paper)
Nonparametric Teaching for Multiple Learners (2023, cites this paper)
Deep Reinforcement Learning for Autonomous Navigation on Duckietown Platform: Evaluation of Adversarial Robustness (2023, cites this paper)
BadRL: Sparse Targeted Backdoor Attack Against Reinforcement Learning (2023, cites this paper)
Principal-Agent Reward Shaping in MDPs (2023, cites this paper)
Ensemble Reinforcement Learning in Collision Avoidance to Enhance Decision-Making Reliability (2023, cites this paper)
Multi-Agent Reinforcement Learning for Wireless Networks Against Adversarial Communications (2023, cites this paper)
Admissible Policy Teaching through Reward Design (2022, influential citation)
Trusted AI in Multiagent Systems: An Overview of Privacy and Security for Distributed Learning (2022, influential citation)
Reward Poisoning Attacks on Offline Multi-Agent Reinforcement Learning (2022, cites this paper)
Targeted Adversarial Attacks on Deep Reinforcement Learning Policies via Model Checking (2022, cites this paper)
Security of Deep Reinforcement Learning for Autonomous Driving: A Survey (2022, cites this paper)
New challenges in reinforcement learning: a survey of security and privacy (2022, influential citation)
One4All: Manipulate one agent to poison the cooperative multi-agent reinforcement learning (2022, cites this paper)
Imitating Opponent to Win: Adversarial Policy Imitation Learning in Two-player Competitive Games (2022, cites this paper)
Iterative Teaching by Data Hallucination (2022, cites this paper)
Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning (2022, cites this paper)
Machine Teaching (2022, cites this paper)
Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning (2022, influential citation)
Threats to Training: A Survey of Poisoning Attacks and Defenses on Machine Learning Systems (2022, influential citation)
Environment Design for Biased Decision Makers (2022, influential citation)
Spiking Pitch Black: Poisoning an Unknown Environment to Attack Unknown Reinforcement Learners (2022, influential citation)
Transferable Environment Poisoning: Training-time Attack on Reinforcement Learner with Limited Prior Knowledge (2022, cites this paper)
Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning (2022, cites this paper)
Understanding the Limits of Poisoning Attacks in Episodic Reinforcement Learning (2022, cites this paper)