Self-supervised 3D Shape and Viewpoint Estimation from Single Images for Robotics
Oier Mees, Maxim Tatarchenko, T. Brox, Wolfram Burgard
Published 2019 in IEEE/RSJ International Conference on Intelligent Robots and Systems
ABSTRACT
We present a convolutional neural network for joint 3D shape prediction and viewpoint estimation from a single input image. During training, our network receives its learning signal from the silhouette of the object in the input image, a form of self-supervision; it requires no ground-truth 3D shapes or viewpoints. Because it relies on such a weak form of supervision, our approach can easily be applied to real-world data. We demonstrate that our method produces reasonable qualitative and quantitative results on natural images for both shape estimation and viewpoint prediction. Unlike previous approaches, our method does not require multiple views of the same object instance in the dataset, which significantly broadens its applicability in practical robotics scenarios. We showcase this by using the hallucinated shapes to improve performance on the task of grasping real-world objects, both in simulation and with a PR2 robot.
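The silhouette supervision described in the abstract can be sketched as a reprojection loss: render the predicted 3D occupancy into a 2D silhouette and compare it against the observed object mask. The following is a minimal illustrative sketch, not the paper's actual pipeline; in particular, the fixed-axis orthographic projection and the function names are assumptions (the method itself also estimates the viewpoint used for projection).

```python
import numpy as np

def project_silhouette(voxels, axis=0):
    # Soft orthographic projection of an occupancy grid in [0, 1]:
    # a pixel is covered if any voxel along the viewing ray is occupied,
    # computed differentiably as 1 - prod(1 - occupancy).
    return 1.0 - np.prod(1.0 - voxels, axis=axis)

def silhouette_loss(pred_voxels, target_mask, axis=0, eps=1e-7):
    # Per-pixel binary cross-entropy between the reprojected silhouette
    # and the observed 2D object mask (the only supervision signal).
    sil = np.clip(project_silhouette(pred_voxels, axis=axis), eps, 1.0 - eps)
    bce = -(target_mask * np.log(sil) + (1.0 - target_mask) * np.log(1.0 - sil))
    return bce.mean()
```

Because the projection is made of differentiable operations, gradients of this loss flow back into the predicted occupancy grid, which is what lets a 2D mask supervise a 3D shape.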
PUBLICATION RECORD
- Publication year: 2019
- Venue: IEEE/RSJ International Conference on Intelligent Robots and Systems
- Publication date: 2019-10-17
- Fields of study: Computer Science, Engineering
- Source metadata: Semantic Scholar