Hierarchical Imitation and Reinforcement Learning

Hoang Minh Le, Nan Jiang, Alekh Agarwal, Miroslav Dudík, Yisong Yue, Hal Daumé

Published 2018 in International Conference on Machine Learning

ABSTRACT

We study how to effectively leverage expert feedback to learn sequential decision-making policies. We focus on problems with sparse rewards and long time horizons, which typically pose significant challenges in reinforcement learning. We propose an algorithmic framework, called hierarchical guidance, that leverages the hierarchical structure of the underlying problem to integrate different modes of expert interaction. Our framework can incorporate different combinations of imitation learning (IL) and reinforcement learning (RL) at different levels, leading to dramatic reductions in both expert effort and cost of exploration. Using long-horizon benchmarks, including Montezuma's Revenge, we demonstrate that our approach can learn significantly faster than hierarchical RL, and be significantly more label-efficient than standard IL. We also theoretically analyze labeling cost for certain instantiations of our framework.

PUBLICATION RECORD

Publication year: 2018
Venue: International Conference on Machine Learning
Publication date: 2018-03-01
Fields of study: Mathematics, Computer Science
Identifiers: arXiv 1803.00590
Source metadata: Semantic Scholar
