Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning

N. Rudin, David Hoeller, Philipp Reist, Marco Hutter

Published 2021 in Conference on Robot Learning

ABSTRACT

In this work, we present and study a training set-up that achieves fast policy generation for real-world robotic tasks by using massive parallelism on a single workstation GPU. We analyze and discuss the impact of different training algorithm components in the massively parallel regime on the final policy performance and training times. In addition, we present a novel game-inspired curriculum that is well suited for training with thousands of simulated robots in parallel. We evaluate the approach by training the quadrupedal robot ANYmal to walk on challenging terrain. The parallel approach allows training policies for flat terrain in under four minutes, and in twenty minutes for uneven terrain. This represents a speedup of multiple orders of magnitude compared to previous work. Finally, we transfer the policies to the real robot to validate the approach. We open-source our training code to help accelerate further research in the field of learned legged locomotion.

PUBLICATION RECORD

Publication year
2021
Venue
Conference on Robot Learning
Publication date
2021-09-24
Fields of study
Computer Science, Engineering
Identifiers
arXiv 2109.11978
Source metadata
Semantic Scholar

REFERENCES

Large Batch Simulation for Deep Reinforcement Learning
2021cited by this paper
Real-Time Trajectory Adaptation for Quadrupedal Locomotion using Deep Reinforcement Learning
2021cited by this paper
Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning
2021cited by this paper
Brax - A Differentiable Physics Engine for Large Scale Rigid Body Simulation
2021cited by this paper
Blind Bipedal Stair Traversal via Sim-to-Real Reinforcement Learning
2021cited by this paper
Autonomous Spot: Long-Range Autonomous Exploration of Extreme Environments with Legged Locomotion
2020cited by this paper
Learning quadrupedal locomotion over challenging terrain
2020cited by this paper
ALLSTEPS: Curriculum‐driven Learning of Stepping Stone Skills
2020cited by this paper
Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions
2019cited by this paper
Learning agile and dynamic motor skills for legged robots
2019cited by this paper
DeepGait: Planning and Control of Quadrupedal Gaits Using Deep Reinforcement Learning
2019cited by this paper
Solving Rubik's Cube with a Robot Hand
2019cited by this paper
ANYmal in the Field: Solving Industrial Inspection of an Offshore HVDC Platform with a Quadrupedal Robot
2019cited by this paper
Per-Contact Iteration Method for Solving Contact Dynamics
2018cited by this paper
Accelerated Methods for Deep Reinforcement Learning
2018influential reference
GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning
2018influential reference
Automatic Goal Generation for Reinforcement Learning Agents
2017cited by this paper
Emergence of Locomotion Behaviours in Rich Environments
2017cited by this paper
Time Limits in Reinforcement Learning
2017cited by this paper
Proximal Policy Optimization Algorithms
2017cited by this paper
Self-Supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation
2017cited by this paper
Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates
2016cited by this paper
High-Dimensional Continuous Control Using Generalized Advantage Estimation
2015cited by this paper
MuJoCo: A physics engine for model-based control
2012cited by this paper
Where is the data? Why you cannot debate CPU vs. GPU performance without the answer
2011cited by this paper

CITED BY

Towards Torque-Driven Reinforcement Learning for Quadruped Locomotion
2026cites this paper
Toward Reliable Sim-to-Real Predictability for MoE-based Robust Quadrupedal Locomotion
2026influential citation
Learning Human-Like Badminton Skills for Humanoid Robots
2026cites this paper
Learning Humanoid Loco-manipulation with Constraints as Terminations
2026influential citation
PILOT: A Perceptive Integrated Low-level Controller for Loco-manipulation over Unstructured Scenes
2026cites this paper
Attention-Based Neural-Augmented Kalman Filter for Legged Robot State Estimation
2026cites this paper
DeFM: Learning Foundation Representations from Depth for Robotics
2026cites this paper
Why Look at It at All?: Vision-Free Multifingered Blind Grasping Using Uniaxial Fingertip Force Sensing
2026cites this paper
Align and Filter: Improving Performance in Asynchronous On-Policy RL
2026cites this paper
Learning Vision-Based Omnidirectional Navigation: A Teacher-Student Approach Using Monocular Depth Estimation
2026cites this paper
A Hybrid Autoencoder for Robust Heightmap Generation from Fused Lidar and Depth Data for Humanoid Robot Locomotion
2026cites this paper
Fauna Sprout: A lightweight, approachable, developer-ready humanoid robot
2026cites this paper
eGAIT: Multi-Skilled Policy for Energy-Efficient Gait Transitions
2026cites this paper
Improving Regret Approximation for Unsupervised Dynamic Environment Generation
2026cites this paper
AME-2: Agile and Generalized Legged Locomotion via Attention-Based Neural Map Encoding
2026influential citation
K-Myriad: Jump-starting reinforcement learning with unsupervised parallel agents
2026cites this paper
Flow Policy Gradients for Robot Control
2026cites this paper
ECO: Energy-Constrained Optimization With Reinforcement Learning for Humanoid Walking
2026cites this paper
AdaptManip: Learning Adaptive Whole-Body Object Lifting and Delivery with Online Recurrent State Estimation
2026cites this paper
Rethinking Policy Diversity in Ensemble Policy Gradient in Large-Scale Reinforcement Learning
2026cites this paper
Reward-Conditioned Reinforcement Learning
2026cites this paper
CMoE: Contrastive Mixture of Experts for Motion Control and Terrain Adaptation of Humanoid Robots
2026cites this paper
LocoVLM: Grounding Vision and Language for Adapting Versatile Legged Locomotion Policies
2026cites this paper
Feasibility-Guided Planning over Multi-Specialized Locomotion Policies
2026cites this paper
ZEST: Zero-shot Embodied Skill Transfer for Athletic Robot Control
2026cites this paper
GPO: Growing Policy Optimization for Legged Robot Locomotion and Whole-Body Control
2026cites this paper
Transformable Quadruped Wheelchair: Unified Walking and Wheeled Locomotion via Mode-Conditioned Policy Distillation
2026cites this paper
A systematic literature review: Generative artificial intelligence applications for ground mobile robot navigation
2026cites this paper
Online Behavior-Centric Adaptation for Bipedal Robot Sim-to-Real Transfer With Unmodeled Dynamics Mismatch
2026cites this paper
DemoBot: Efficient Learning of Bimanual Manipulation with Dexterous Hands From Third-Person Human Videos
2026cites this paper
DART: A state-aware online co-scheduling runtime for data-parallel training
2026cites this paper
Locomotion Beyond Feet
2026cites this paper
Efficiently Learning Robust Torque-Based Locomotion Through Reinforcement With Model-Based Supervision
2026cites this paper
Scaling Rough Terrain Locomotion with Automatic Curriculum Reinforcement Learning
2026cites this paper
Towards Bridging the Gap between Large-Scale Pretraining and Efficient Finetuning for Humanoid Control
2026cites this paper
World-Gymnast: Training Robots with Reinforcement Learning in a World Model
2026cites this paper
ALORE: Autonomous Large-Object Rearrangement with a Legged Manipulator
2026cites this paper
HiWET: Hierarchical World-Frame End-Effector Tracking for Long-Horizon Humanoid Loco-Manipulation
2026cites this paper
Agile asymmetric multi-legged locomotion: contact planning via geometric mechanics and spin model duality
2026cites this paper
ExtremControl: Low-Latency Humanoid Teleoperation with Direct Extremity Control
2026cites this paper
Expert-Guided Imitation for Learning Humanoid Loco-Manipulation from Motion Capture
2026cites this paper
Learning Thermal-Aware Locomotion Policies for an Electrically-Actuated Quadruped Robot
2026cites this paper
Look Forward to Walk Backward: Efficient Terrain Memory for Backward Locomotion with Forward Vision
2026cites this paper
Diffusion Policy through Conditional Proximal Policy Optimization
2026influential citation
Learning Hip Exoskeleton Control Policy via Predictive Neuromusculoskeletal Simulation
2026cites this paper
X-Loco: Towards Generalist Humanoid Locomotion Control via Synergetic Policy Distillation
2026cites this paper
Dynamic Bimanual Cloth Manipulation via Dynamic Movement Primitives and Reinforcement Learning
2026cites this paper
System Design of the Ultra Mobility Vehicle: A Driving, Balancing, and Jumping Bicycle Robot
2026cites this paper
APEX: Learning Adaptive High-Platform Traversal for Humanoid Robots
2026cites this paper
Learning Agile Quadrotor Flight in the Real World
2026cites this paper
Learning Soccer Skills for Humanoid Robots: A Progressive Perception-Action Framework
2026cites this paper
CMR: Contractive Mapping Embeddings for Robust Humanoid Locomotion on Unstructured Terrains
2026cites this paper
PolicyFlow: Policy Optimization with Continuous Normalizing Flow in Reinforcement Learning
2026cites this paper
RobotMover: Learning to Move Large Objects From Human Demonstrations
2025cites this paper
JAEGER: Dual-Level Humanoid Whole-Body Controller
2025cites this paper
Learning Diverse Natural Behaviors for Enhancing the Agility of Quadrupedal Robots
2025cites this paper
Inner–Outer Loop Intelligent Morphology Optimization and Pursuit–Evasion Control for Space Modular Robot
2025cites this paper
ADD: Physics-Based Motion Imitation with Adversarial Differential Discriminators
2025influential citation
Drive Fast, Learn Faster: On-Board RL for High Performance Autonomous Racing
2025cites this paper
HuB: Learning Extreme Humanoid Balance
2025cites this paper
DAPPER: Discriminability-Aware Policy-to-Policy Preference-Based Reinforcement Learning for Query-Efficient Robot Skill Acquisition
2025influential citation
MA-ROESL: Motion-aware Rapid Reward Optimization for Efficient Robot Skill Learning from Single Videos
2025cites this paper
Towards Embodiment Scaling Laws in Robot Locomotion
2025cites this paper
Otto—Design and Control of an 8-DoF SEA-Driven Quadrupedal Robot
2025influential citation
RobotxR1: Enabling Embodied Robotic Intelligence on Large Language Models through Closed-Loop Reinforcement Learning
2025cites this paper
Understanding Human Behaviour in a Robotic Guide Dog Using Parallel Deep Reinforcement Learning
2025cites this paper
ReGen: Generative Robot Simulation via Inverse Design
2025cites this paper
APEX: Action Priors Enable Efficient Exploration for Skill Imitation on Articulated Robots
2025cites this paper
Enhancing Diversity in Parallel Agents: A Maximum State Entropy Exploration Story
2025cites this paper
MULE - Multi-Terrain and Unknown Load Adaptation for Effective Quadrupedal Locomotion
2025cites this paper
Robust Localization, Mapping, and Navigation for Quadruped Robots
2025cites this paper
Learned Perceptive Forward Dynamics Model for Safe and Platform-aware Robotic Navigation
2025cites this paper
High-Performance Reinforcement Learning on Spot: Optimizing Simulation Parameters with Distributional Measures
2025cites this paper
STDArm: Transferring Visuomotor Policies From Static Data Training to Dynamic Robot Manipulation
2025cites this paper
TWIST: Teleoperated Whole-Body Imitation System
2025cites this paper
PPF: Pre-Training and Preservative Fine-Tuning of Humanoid Locomotion via Model-Assumption-Based Regularization
2025cites this paper
Teacher Motion Priors: Enhancing Robot Locomotion over Challenging Terrain
2025cites this paper
Practical Reinforcement Learning Using Time-Efficient Model-Based Policy Optimization
2025cites this paper
Safety supervision framework for legged robots through safety verification and fall protection
2025cites this paper
Rambo: RL-Augmented Model-Based Whole-Body Control for Loco-Manipulation
2025cites this paper
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
2025cites this paper
Dynamic Legged Ball Manipulation on Rugged Terrains with Hierarchical Reinforcement Learning
2025cites this paper
Visual Imitation Enables Contextual Humanoid Control
2025cites this paper
Robust Reinforcement Learning-Based Locomotion for Resource-Constrained Quadrupeds with Exteroceptive Sensing
2025influential citation
Bio-inspired neural networks with central pattern generators for learning multi-skill locomotion
2025cites this paper
1000 Layer Networks for Self-Supervised RL: Scaling Depth Can Enable New Goal-Reaching Capabilities
2025cites this paper
Gait Planning, and Motion Control Methods for Quadruped Robots: Achieving High Environmental Adaptability: A Review
2025cites this paper
DUNE: Sim2Real Transfer for Depth-based Navigation in Unstructured Dynamic Indoor Environments
2025cites this paper
LEVA: A High-Mobility Logistic Vehicle with Legged Suspension
2025cites this paper
Dynamic Obstacle Avoidance with Bounded Rationality Adversarial Reinforcement Learning
2025cites this paper
Residual Policy Gradient: A Reward View of KL-regularized Objective
2025cites this paper
Bio-Inspired Plastic Neural Networks for Zero-Shot Out-of-Distribution Generalization in Complex Animal-Inspired Robots
2025cites this paper
Parental Guidance: Efficient Lifelong Learning through Evolutionary Distillation
2025cites this paper
Transferable Latent-To-Latent Locomotion Policy for Efficient and Versatile Motion Control of Diverse Legged Robots
2025cites this paper
Hierarchical reinforcement learning for enhancing stability and adaptability of hexapod robots in complex terrains
2025cites this paper
Bridge the Gap: Enhancing Quadruped Locomotion with Vertical Ground Perturbations
2025influential citation
Sim-to-Real of Humanoid Locomotion Policies via Joint Torque Space Perturbation Injection
2025cites this paper
Bridging Deep Reinforcement Learning and Motion Planning for Model-Free Navigation in Cluttered Environments
2025cites this paper
A Unified and General Humanoid Whole-Body Controller for Versatile Locomotion
2025cites this paper
Unified Locomotion Transformer with Simultaneous Sim-to-Real Transfer for Quadrupeds
2025influential citation