Multi-view 3D Object Detection Network for Autonomous Driving

Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia

Published 2016 in Computer Vision and Pattern Recognition

ABSTRACT

This paper aims at high-accuracy 3D object detection in autonomous driving scenarios. We propose Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point clouds and RGB images as input and predicts oriented 3D bounding boxes. We encode the sparse 3D point cloud with a compact multi-view representation. The network is composed of two subnetworks: one for 3D object proposal generation and another for multi-view feature fusion. The proposal network generates 3D candidate boxes efficiently from the bird's eye view representation of the 3D point cloud. We design a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths. Experiments on the challenging KITTI benchmark show that our approach outperforms the state of the art by around 25% and 30% AP on the tasks of 3D localization and 3D detection, respectively. In addition, for 2D detection, our approach obtains 14.9% higher AP than the state of the art on the hard data among the LIDAR-based methods.
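
The bird's eye view representation mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the encoding, so everything below is an assumption made for illustration: the function name bev_encode, the grid extents, the cell size, and the choice of two channels (maximum height and log-scaled point count per cell). It is a simplified example of projecting a point cloud onto a top-down grid, not the paper's feature set.

```python
import math

def bev_encode(points, x_range=(0.0, 4.0), y_range=(-2.0, 2.0), cell=1.0):
    """Project (x, y, z) points onto a top-down 2D grid.

    Returns two grids indexed [row][col]: the maximum point height per cell
    and a log-scaled point count per cell. Ranges, cell size, and channels
    are hypothetical values chosen for this sketch.
    """
    rows = int(round((x_range[1] - x_range[0]) / cell))
    cols = int(round((y_range[1] - y_range[0]) / cell))
    height = [[0.0] * cols for _ in range(rows)]  # empty cells stay at 0.0
    count = [[0] * cols for _ in range(rows)]
    for x, y, z in points:
        # Discard points that fall outside the grid.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        r = min(int((x - x_range[0]) / cell), rows - 1)
        c = min(int((y - y_range[0]) / cell), cols - 1)
        if count[r][c] == 0 or z > height[r][c]:
            height[r][c] = z
        count[r][c] += 1
    density = [[math.log(n + 1) for n in row] for row in count]
    return height, density

# Two points share one cell, one lands in the far corner, one is out of range.
points = [(0.5, -1.5, 0.2), (0.6, -1.4, 1.1), (3.2, 1.9, -0.3), (9.0, 0.0, 0.0)]
height, density = bev_encode(points)
print(height[0][0], height[3][3])
```

Because the output is a fixed-size 2D grid, it can be fed to ordinary 2D convolutional layers, which is what makes a top-down projection attractive for generating box proposals efficiently.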

PUBLICATION RECORD

Publication year: 2016
Venue: Computer Vision and Pattern Recognition
Publication date: 2016-11-23
Fields of study: Computer Science, Engineering
Identifiers: DOI 10.1109/CVPR.2017.691; arXiv 1611.07759
Source metadata: Semantic Scholar
