Reinforcement learning (RL) holds promise for real-world robot control in complex and uncertain environments. However, it is difficult, or even impractical, to design an efficient reward function for many tasks, especially in large, high-dimensional environments. Generative adversarial imitation learning (GAIL), a general model-free imitation learning method, allows robots to learn policies directly from expert trajectories in such environments. However, GAIL remains sample inefficient in terms of environmental interaction. In this paper, to address this problem, we propose model-based adversarial imitation learning from demonstrations and human reward (MAILDH), a novel model-based interactive imitation learning framework that combines the advantages of GAIL, interactive RL, and model-based RL. We evaluated our method on eight physics-based discrete and continuous control tasks for RL. Our results show that MAILDH greatly improves sample efficiency and robustness compared to the original GAIL.
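The GAIL component the abstract builds on replaces a hand-designed reward with an adversarial one: a discriminator is trained to tell expert state-action pairs from policy-generated ones, and the policy is rewarded for fooling it. The following is a minimal, self-contained sketch of that idea using a logistic discriminator on toy data; all names, shapes, and the synthetic "expert"/"policy" distributions are illustrative assumptions, not the paper's actual networks or tasks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for (state, action) features: expert pairs cluster around +1,
# current-policy pairs around 0. Purely illustrative, not the paper's data.
expert = rng.normal(1.0, 0.3, size=(256, 4))
policy = rng.normal(0.0, 0.3, size=(256, 4))

w = np.zeros(4)
b = 0.0

def discriminator(x):
    """Logistic model: probability that a (state, action) pair is expert."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

# Train the discriminator by gradient ascent on the GAIL objective:
#   maximize  E_expert[log D]  +  E_policy[log(1 - D)].
lr = 0.5
for _ in range(200):
    d_e, d_p = discriminator(expert), discriminator(policy)
    grad_w = expert.T @ (1.0 - d_e) / len(expert) - policy.T @ d_p / len(policy)
    grad_b = (1.0 - d_e).mean() - d_p.mean()
    w += lr * grad_w
    b += lr * grad_b

# Surrogate imitation reward for the policy: large where the discriminator
# is fooled into thinking a policy sample looks expert-like.
imitation_reward = -np.log(1.0 - discriminator(policy) + 1e-8)
```

In the full method this surrogate reward (combined here, per the abstract, with human-delivered reward and a learned dynamics model) drives a standard RL update of the policy, while the discriminator keeps adapting as the policy improves.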
Model-based Adversarial Imitation Learning from Demonstrations and Human Reward
Jie Huang, Jiangshan Hao, Rongshun Juan, Randy Gomez, Keisuke Nakamura, Guangliang Li
Published 2023 in IEEE/RSJ International Conference on Intelligent Robots and Systems
PUBLICATION RECORD
- Publication year: 2023
- Venue: IEEE/RSJ International Conference on Intelligent Robots and Systems
- Publication date: 2023-10-01
- Fields of study: Computer Science, Engineering
- Source metadata: Semantic Scholar
REFERENCES
- 49 references
CITED BY
- 3 citing papers