In this paper, we present a novel Heavy-Tailed Stochastic Policy Gradient (HT-PSG) algorithm to deal with the challenges of sparse rewards in continuous control problems. Sparse rewards are common in continuous control robotics tasks such as manipulation and navigation and make the learning problem hard due to the non-trivial estimation of value functions over the state space. This demands either reward shaping or expert demonstrations for the sparse reward environment. However, obtaining high-quality demonstrations is quite expensive and sometimes even impossible. We propose a heavy-tailed policy parametrization along with a modified momentum-based policy gradient tracking scheme (HT-SPG) to induce a stable exploratory behavior in the algorithm. The proposed algorithm does not require access to expert demonstrations. We test the performance of HT-SPG on various benchmark tasks of continuous control with sparse rewards such as 1D Mario, Pathological Mountain Car, Sparse Pendulum in OpenAI Gym, and Sparse MuJoCo environments (Hopper-v2, Half-Cheetah, Walker-2D). We show consistent performance improvement across all tasks in terms of high average cumulative reward without requiring access to expert demonstrations. We further demonstrate that a navigation policy trained using HT-SPG can be easily transferred into a Clearpath Husky robot to perform real-world navigation tasks.
Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policy Optimization
Souradip Chakraborty,A. S. Bedi,Kasun Weerakoon,Prithvi Poddar,Alec Koppel,Pratap Tokekar,Dinesh Manocha
Published 2023 in IEEE International Conference on Robotics and Automation
ABSTRACT
PUBLICATION RECORD
- Publication year
2023
- Venue
IEEE International Conference on Robotics and Automation
- Publication date
2023-05-29
- Fields of study
Computer Science, Engineering
- Identifiers
- External record
- Source metadata
Semantic Scholar
CITATION MAP
EXTRACTION MAP
CLAIMS
- No claims are published for this paper.
CONCEPTS
- No concepts are published for this paper.
REFERENCES
Showing 1-50 of 50 references · Page 1 of 1
CITED BY
Showing 1-7 of 7 citing papers · Page 1 of 1