Self-supervised 3D Shape and Viewpoint Estimation from Single Images for Robotics
Oier Mees, Maxim Tatarchenko, T. Brox, Wolfram Burgard
Published 2019 in IEEE/RSJ International Conference on Intelligent Robots and Systems
ABSTRACT
We present a convolutional neural network for joint 3D shape prediction and viewpoint estimation from a single input image. During training, our network receives its learning signal from the silhouette of the object in the input image, a form of self-supervision; it requires no ground-truth 3D shapes or viewpoints. Because it relies on such a weak form of supervision, our approach can easily be applied to real-world data. We demonstrate that our method produces reasonable qualitative and quantitative results on natural images for both shape estimation and viewpoint prediction. Unlike previous approaches, our method does not require multiple views of the same object instance in the dataset, which significantly broadens its applicability in practical robotics scenarios. We showcase this by using the hallucinated shapes to improve performance on the task of grasping real-world objects, both in simulation and with a PR2 robot.
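The silhouette supervision described in the abstract can be sketched as a reprojection loss: render the predicted 3D occupancy into a 2D silhouette and compare it against the observed object mask. The following is a minimal illustrative sketch, not the paper's actual pipeline; in particular, the fixed-axis orthographic projection and the function names are assumptions (the method itself also estimates the viewpoint used for projection).

```python
import numpy as np

def project_silhouette(voxels, axis=0):
    # Soft orthographic projection of an occupancy grid in [0, 1]:
    # a pixel is covered if any voxel along the viewing ray is occupied,
    # computed differentiably as 1 - prod(1 - occupancy).
    return 1.0 - np.prod(1.0 - voxels, axis=axis)

def silhouette_loss(pred_voxels, target_mask, axis=0, eps=1e-7):
    # Per-pixel binary cross-entropy between the reprojected silhouette
    # and the observed 2D object mask (the only supervision signal).
    sil = np.clip(project_silhouette(pred_voxels, axis=axis), eps, 1.0 - eps)
    bce = -(target_mask * np.log(sil) + (1.0 - target_mask) * np.log(1.0 - sil))
    return bce.mean()
```

Because the projection is made of differentiable operations, gradients of this loss flow back into the predicted occupancy grid, which is what lets a 2D mask supervise a 3D shape.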
PUBLICATION RECORD
- Publication year: 2019
- Venue: IEEE/RSJ International Conference on Intelligent Robots and Systems
- Publication date: 2019-10-17
- Fields of study: Computer Science, Engineering
- Source metadata: Semantic Scholar