HDCNet: A Hybrid Depth Completion Network for Grasping Transparent and Reflective Objects

Guanghu Xie, Mingxuan Li, Songwei Wu, Yang Liu, Zongwu Xie, Baoshi Cao, Hong Liu

Published 2025 in arXiv.org

ABSTRACT

Depth perception of transparent and reflective objects has long been a critical challenge in robotic manipulation. Conventional depth sensors often fail to provide reliable measurements on such surfaces, limiting the performance of robots in perception and grasping tasks. To address this issue, we propose a novel depth completion network, HDCNet, which integrates the complementary strengths of Transformer, CNN and Mamba architectures. Specifically, the encoder is designed as a dual-branch Transformer-CNN framework to extract modality-specific features. At the shallow layers of the encoder, we introduce a lightweight multimodal fusion module to effectively integrate low-level features. At the network bottleneck, a Transformer-Mamba hybrid fusion module is developed to achieve deep integration of high-level semantic and global contextual information, significantly enhancing depth completion accuracy and robustness. Extensive evaluations on multiple public datasets demonstrate that HDCNet achieves state-of-the-art (SOTA) performance in depth completion tasks. Furthermore, robotic grasping experiments show that HDCNet substantially improves grasp success rates for transparent and reflective objects, achieving up to a 60% increase.
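The encoder layout described in the abstract can be illustrated with a small shape-tracing sketch. This is not the authors' implementation: it only records which module would produce each fused feature map and at what resolution. The stage count, the channel widths, the factor-of-two downsampling, the number of "shallow" stages, and the assignment of RGB to the Transformer branch and raw depth to the CNN branch are all assumptions made for illustration; the abstract does not state them. All function and class names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FeatureMap:
    """Shape bookkeeping only; no real tensor is stored."""
    producer: str   # name of the (hypothetical) module that produced the map
    channels: int
    height: int
    width: int


def downsample(x: FeatureMap, producer: str, channels: int) -> FeatureMap:
    # Assumption: every encoder stage halves the spatial resolution.
    return FeatureMap(producer, channels, x.height // 2, x.width // 2)


def fuse(a: FeatureMap, b: FeatureMap, producer: str) -> FeatureMap:
    # Fusion is modelled as shape-preserving; both branches must agree in shape.
    if (a.channels, a.height, a.width) != (b.channels, b.height, b.width):
        raise ValueError("branches must agree in shape before fusion")
    return FeatureMap(producer, a.channels, a.height, a.width)


def trace_encoder(
    height: int,
    width: int,
    stage_channels: Tuple[int, ...] = (32, 64, 128, 256),  # assumed widths
    shallow_stages: int = 2,                               # assumed count
) -> List[FeatureMap]:
    """Return the fused feature maps, shallow fusions first, bottleneck last."""
    # Assumption: the Transformer branch sees RGB, the CNN branch sees raw depth.
    transformer_feat = FeatureMap("rgb_input", 3, height, width)
    cnn_feat = FeatureMap("depth_input", 1, height, width)

    fused: List[FeatureMap] = []
    for i, channels in enumerate(stage_channels):
        transformer_feat = downsample(transformer_feat, f"transformer_stage_{i}", channels)
        cnn_feat = downsample(cnn_feat, f"cnn_stage_{i}", channels)
        if i < shallow_stages:
            # Lightweight multimodal fusion of low-level features.
            fused.append(fuse(transformer_feat, cnn_feat, f"lightweight_fusion_{i}"))

    # Transformer-Mamba hybrid fusion at the network bottleneck.
    fused.append(fuse(transformer_feat, cnn_feat, "transformer_mamba_fusion"))
    return fused


if __name__ == "__main__":
    for feature_map in trace_encoder(240, 320):
        print(feature_map)
```

Running the script prints one line per fused feature map, which makes it easy to see where the lightweight fusion stops and the bottleneck fusion takes over under the assumed configuration.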

PUBLICATION RECORD

Publication year
2025
Venue
arXiv.org
Publication date
2025-11-10
Fields of study
Computer Science, Engineering
Identifiers
DOI 10.48550/arXiv.2511.07081 arXiv 2511.07081
Source metadata
Semantic Scholar

REFERENCES

TCRNet: Transparent Object Depth Completion With Cascade Refinements (2025)
DCDU-VMamba: Depth Completion of Transparent Objects With Dual U-VMamba for Robotic Grasp (2025)
GAA-TSO: Geometry-Aware Assisted Depth Completion for Transparent and Specular Objects (2025)
TDCNet: Transparent Objects Depth Completion with CNN-Transformer Dual-Branch Parallel Network (2024)
Transparent Depth Completion Using Segmentation Features (2024)
DistillGrasp: Integrating Features Correlation With Knowledge Distillation for Depth Completion of Transparent Objects (2024)
Diffusion-Based Depth Inpainting for Transparent and Reflective Objects (2024)
Transparent Object Depth Perception Network for Robotic Manipulation Based on Orientation-Aware Guidance and Texture Enhancement (2024)
Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts (2024)
Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction (2024)
FDCT: Fast Depth Completion for Transparent Objects (2023)
Learning Depth Estimation for Transparent and Mirror Surfaces (2023)
A survey of the vision transformers and their CNN-transformer based variants (2023)
Optimized design and characterization of a non-linear 3D misalignment measurement system (2022)
View and Scanning-Depth Expansion Photographic Microscope Using Ultrafast Switching Mirrors (2022)
TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and A Grasping Baseline (2022)
Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects (2022)
Data-driven robotic visual grasping detection for unknown objects: A problem-oriented review (2022)
TODE-Trans: Transparent Object Depth Estimation with Transformer (2022)
Stitching Based on Corrections to Obtain a Flat Image on a Curved-Edge OLED Display (2022)
GraspVDN: scene-oriented grasp estimation by learning vector representations of grasps (2021)
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation (2021)
Mobile-Former: Bridging MobileNet and Transformer (2021)
DepthGrasp: Depth Completion of Transparent Objects Using Self-Attentive Adversarial Network with Spectral Residual for Grasping (2021)
Seeing Glass: Joint Point Cloud and Depth Completion for Transparent Objects (2021)
Depth Completion using Plane-Residual Representation (2021)
RGB-D Local Implicit Function for Depth Completion of Transparent Objects (2021)
Evolving Attention with Residual Convolutions (2021)
Depth Completion via Inductive Fusion of Planar LIDAR and Monocular Camera (2020)
Robust Plane Detection Using Depth Information From a Consumer Depth Camera (2019)
Clear Grasp: 3D Shape Estimation of Transparent Objects for Manipulation (2019)
Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network (2018)
Full 3D reconstruction of transparent objects (2018)
Estimating Depth from RGB and Sparse Sensing (2018)
DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene From Sparse LiDAR Data and Single Color Image (2018)
Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image (2017)
3D Reconstruction of Transparent Objects with Position-Normal Consistency (2016)
Depth Map Prediction from a Single Image using a Multi-Scale Deep Network (2014)
Conference on Computer Vision and Pattern Recognition (2005)