Robust Fusion of Color and Depth Data for RGB-D Target Tracking Using Adaptive Range-Invariant Depth Models and Spatio-Temporal Consistency Constraints

Jingjing Xiao, R. Stolkin, Yuqing Gao, A. Leonardis

Published 2018 in IEEE Transactions on Cybernetics

ABSTRACT

This paper presents a novel robust method for single target tracking in RGB-D images, and also contributes a substantial new benchmark dataset for evaluating RGB-D trackers. While a target object’s color distribution is reasonably motion-invariant, this is not true for the target’s depth distribution, which continually varies as the target moves relative to the camera. It is therefore nontrivial to design target models which can fully exploit (potentially very rich) depth information for target tracking. For this reason, much of the previous RGB-D literature relies on color information for tracking, while exploiting depth information only for occlusion reasoning. In contrast, we propose an adaptive range-invariant target depth model, and show how both depth and color information can be fully and adaptively fused during the search for the target in each new RGB-D image. We introduce a new, hierarchical, two-layered target model (comprising local and global models) which uses spatio-temporal consistency constraints to achieve stable and robust on-the-fly target relearning. In the global layer, multiple features, derived from both color and depth data, are adaptively fused to find a candidate target region. In ambiguous frames, where one or more features disagree, this global candidate region is further decomposed into smaller local candidate regions for matching to local-layer models of small target parts. We also note that conventional use of depth data, for occlusion reasoning, can easily trigger false occlusion detections when the target moves rapidly toward the camera. To overcome this problem, we show how combining target information with contextual information enables the target’s depth constraint to be relaxed. Our adaptively relaxed depth constraints can robustly accommodate large and rapid target motion in the depth direction, while still enabling the use of depth data for highly accurate reasoning about occlusions. 
For evaluation, we introduce a new RGB-D benchmark dataset with per-frame annotated attributes and extensive bias analysis. Our tracker is evaluated using two different state-of-the-art methodologies, the VOT methodology and the Object Tracking Benchmark (OTB), and in both cases it significantly outperforms four other state-of-the-art RGB-D trackers from the literature.
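
To illustrate why a depth model needs to be range-invariant, here is a minimal sketch. It is not the authors' implementation. It builds a histogram of depth offsets measured from the candidate region's median depth, so that a target seen at about 1 m and the same target seen at about 2.5 m yield the same descriptor. The function names, the use of the median as the reference depth, the bin width, and the clamping range are all assumptions made for this example; the abstract does not specify them. The histograms are compared with the Bhattacharyya coefficient, a common similarity measure for normalised histograms.

```python
from math import sqrt
from statistics import median


def relative_depth_histogram(depths_mm, bin_width_mm=50, half_range_mm=500):
    """Normalised histogram of depth offsets from the region's median depth.

    Hypothetical helper: the reference depth (median), bin width and range
    are illustrative assumptions, not values taken from the paper.
    """
    # Treat non-positive readings as sensor dropouts and ignore them.
    valid = [d for d in depths_mm if d > 0]
    centre = median(valid)
    n_bins = (2 * half_range_mm) // bin_width_mm
    hist = [0.0] * n_bins
    for d in valid:
        offset = d - centre
        # Clamp so that far-away background pixels land in the edge bins.
        offset = max(-half_range_mm, min(half_range_mm - 1, offset))
        hist[int((offset + half_range_mm) // bin_width_mm)] += 1.0
    total = sum(hist)
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Similarity of two normalised histograms (1.0 means identical)."""
    return sum(sqrt(a * b) for a, b in zip(p, q))


# The same surface profile observed near the camera and 1.5 m further away.
near = [1000, 1020, 1040, 1100, 1180, 1200]
far = [d + 1500 for d in near]

h_near = relative_depth_histogram(near)
h_far = relative_depth_histogram(far)
similarity = bhattacharyya_coefficient(h_near, h_far)
```

In this example the two relative histograms are identical, so `similarity` is 1.0 up to floating-point rounding. Histograms of the absolute depth values for the same two regions would share no occupied bins, which is the motion sensitivity the abstract describes.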

PUBLICATION RECORD

Publication year
2018
Venue
IEEE Transactions on Cybernetics
Publication date
2018-08-01
Fields of study
Medicine, Computer Science, Engineering
Identifiers
DOI: 10.1109/tcyb.2017.2740952; PMID: 28885166
Source metadata
Semantic Scholar, PubMed

REFERENCES

Continuously Adaptive Data Fusion and Model Relearning for Particle Filter Tracking With Multiple Features
2016, cited by this paper
DS-KCF: a real-time tracker for RGB-D data
2016, influential reference
An occlusion-aware particle filter tracker to handle complex and persistent occlusions
2016, cited by this paper
NUS-PRO: A New Visual Tracking Challenge
2016, cited by this paper
Robust Visual Tracking via Exclusive Context Modeling
2016, cited by this paper
Perceptually Motivated Image Features Using Contours
2016, cited by this paper
3D Part-Based Sparse Tracker with Automatic Synchronization and Registration
2016, cited by this paper
Distractor-Supported Single Target Tracking in Extremely Cluttered Scenes
2016, cited by this paper
3D object tracking via image sets and depth-based occlusion detection
2015, cited by this paper
Single target tracking using adaptive clustered decision trees and dynamic multi-level appearance models
2015, influential reference
A Novel Performance Evaluation Methodology for Single-Target Trackers
2015, cited by this paper
Visual Object Tracking Performance Measures Revisited
2015, cited by this paper
Video Tracking Using Learned Hierarchical Features
2015, cited by this paper
Structural Sparse Tracking
2015, cited by this paper
Real-time RGB-D Tracking with Depth Scaling Kernelised Correlation Filters and Occlusion Handling
2015, influential reference
Online adaptive hidden Markov model for multi-tracker fusion
2015, cited by this paper
In defense of color-based model-free tracking
2015, cited by this paper
Single and Multiple Object Tracking Using a Multi-Feature Joint Sparse Representation
2015, cited by this paper
Exploring Causal Relationships in Visual Object Tracking
2015, cited by this paper
Object Tracking Benchmark
2015, influential reference
Online learning 3D context for robust visual tracking
2015, cited by this paper
Robust depth-based object tracking from a moving binocular camera
2015, cited by this paper
Adaptive Color Attributes for Real-Time Visual Tracking
2014, cited by this paper
Object Tracking by Oversampling Local Features
2014, cited by this paper
Particle Filter Tracking of Camouflaged Targets by Adaptive Fusion of Thermal and Visible Spectra Camera Data
2014, cited by this paper
Visual Tracking: An Experimental Survey
2014, cited by this paper
Unifying Spatial and Attribute Selection for Distracter-Resilient Tracking
2014, cited by this paper
Visual Tracking: An Experimental Survey (Smeulders; UvA-DARE Digital Academic Repository record)
2013, cited by this paper
Online Object Tracking: A Benchmark
2013, influential reference
Robust Visual Tracking Using an Adaptive Coupled-Layer Visual Model
2013, cited by this paper
Highly Nonrigid Object Tracking via Patch-Based Dynamic Appearance Modeling
2013, cited by this paper
Tracking Revisited Using RGBD Camera: Unified Benchmark and Baselines
2013, cited by this paper
Visual tracking via adaptive structural local sparse appearance model
2012, cited by this paper
Single and Multiple Object Tracking Using Log-Euclidean Riemannian Subspace and Block-Division Appearance Model
2012, cited by this paper
Robust object tracking via sparsity-based collaborative model
2012, cited by this paper
SLIC Superpixels Compared to State-of-the-Art Superpixel Methods
2012, cited by this paper
Dual-Force Metric Learning for Robust Distracter-Resistant Tracker
2012, cited by this paper
Outdoor RGB-D SLAM Performance in Slow Mine Detection
2012, cited by this paper
Tracking people within groups with RGB-D data
2012, cited by this paper
People tracking in RGB-D data with on-line boosted target models
2011, cited by this paper
Unbiased look at dataset bias
2011, cited by this paper
A Two-Stage Dynamic Model for Visual Tracking
2010, cited by this paper
Learning Color Names for Real-World Applications
2009, cited by this paper
Robust visual tracking using ℓ1 minimization
2009, cited by this paper
Robust Visual Tracking Based on Incremental Tensor Subspace Learning
2007, cited by this paper
A calibration system for measuring 3D ground truth for validation and error analysis of robot vision algorithms
2006, cited by this paper
An Adaptive Background Model for Camshift Tracking with a Moving Camera
2006, cited by this paper
Measuring complete ground-truth data and error estimates for real video
2005, cited by this paper
An adaptive color-based particle filter
2003, cited by this paper
On a measure of divergence between two statistical populations defined by their probability distributions
1943, cited by this paper
High-Speed Tracking with Kernelized Correlation Filters (IEEE Transactions on Pattern Analysis and Machine Intelligence)
year unknown, cited by this paper

CITED BY

Adaptive Multi-Modal Visual Tracking With Dynamic Semantic Prompts
2026, cites this paper
AGVOT: Visual Object Tracking via Cooperation of Aerial and Ground Views
2026, cites this paper
UBATrack: Spatio-Temporal State Space Model for General Multi-Modal Tracking
2026, cites this paper
Omni Survey for Multimodality Analysis in Visual Object Tracking
2025, cites this paper
Collaborating Vision, Depth, and Thermal Signals for Multi-Modal Tracking: Dataset and Algorithm
2025, cites this paper
Adaptive Colour-Depth Aware Attention for RGB-D Object Tracking
2025, cites this paper
RGB-D visual object tracking with transformer-based multi-modal feature fusion
2025, cites this paper
A Novel Hybrid 2.5D Map Representation Method Enabling 3D Reconstruction of Semantic Objects in Expansive Indoor Environments
2025, cites this paper
RGB-Sonar Tracking Benchmark and Spatial Cross-Attention Transformer Tracker
2024, cites this paper
Self-supervised learning for RGB-D object tracking
2024, cites this paper
Weakly-Supervised RGBD Video Object Segmentation
2024, cites this paper
UniMod1K: Towards a More Universal Large-Scale Dataset and Benchmark for Multi-modal Learning
2024, cites this paper
Feature enhancement and coarse-to-fine detection for RGB-D tracking
2024, cites this paper
Thermal Infrared Target Tracking: A Comprehensive Review
2024, cites this paper
A Survey of RGB-Depth Object Tracking
2024, influential citation
UBPT: Unidirectional and Bidirectional Prompts for RGBD Tracking
2024, cites this paper
Performance Evaluation of Deep Learning-based Quadrotor UAV Detection and Tracking Methods
2024, cites this paper
DepthRefiner: Adapting RGB Trackers to RGBD Scenes via Depth-Fused Refinement
2024, cites this paper
Temporal adaptive bidirectional bridging for RGB-D tracking
2024, cites this paper
AMATrack: A Unified Network With Asymmetric Multimodal Mixed Attention for RGBD Tracking
2024, cites this paper
A Review of Time-of-Flight Imaging Through Scattering Media (original title: Time-of-Flight透散射介质成像技术综述)
2023, cites this paper
A Universal Event-Based Plug-In Module for Visual Object Tracking in Degraded Conditions
2023, cites this paper
Resource-Efficient RGBD Aerial Tracking
2023, cites this paper
Reliable information system for identifying spatio-temporal continuity of kinetic deformed objects with big point cloud data
2023, cites this paper
Multi-sensor based object tracking using enhanced particle swarm optimized multi-cue granular fusion
2023, cites this paper
Robot-Person Tracking in Uniform Appearance Scenarios: A New Dataset and Challenges
2023, cites this paper
ARKitTrack: A New Diverse Dataset for Tracking Using Mobile RGB-D Data
2023, cites this paper
RGBD Object Tracking: An In-depth Review
2022, influential citation
The fusion of infrared and visible images via decomposition-based structure transfer and local saliency detection
2022, cites this paper
Visual Object Tracking on Multi-modal RGB-D Videos: A Review
2022, influential citation
Fuzziness based semi-supervised multimodal learning for patient's activity recognition using RGBDT videos
2022, cites this paper
Graph-Based Point Tracker for 3D Object Tracking in Point Clouds
2022, cites this paper
RGBD1K: A Large-scale Dataset and Benchmark for RGB-D Object Tracking
2022, influential citation
Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline
2022, cites this paper
Learning Dual-Fused Modality-Aware Representations for RGBD Tracking
2022, influential citation
Unsupervised monocular depth estimation with aggregating image features and wavelet SSIM (Structural SIMilarity) loss
2021, cites this paper
Depth-only Object Tracking
2021, cites this paper
Object Tracking by Jointly Exploiting Frame and Event Domain
2021, cites this paper
DepthTrack: Unveiling the Power of RGBD Tracking
2021, influential citation
Multi-domain collaborative feature representation for robust visual object tracking
2021, cites this paper
Depth-Aware Object Tracking With a Conditional Variational Autoencoder
2021, cites this paper
Attribute filter based infrared and visible image fusion
2021, cites this paper
DAL: A Deep Depth-Aware Long-term Tracker
2021, influential citation
The Eighth Visual Object Tracking VOT2020 Challenge Results
2020, cites this paper
A Cuboid CNN Model With an Attention Mechanism for Skeleton-Based Action Recognition
2020, cites this paper
Multi-modal visual tracking: Review and experimental comparison
2020, cites this paper
Robust RGB-D tracking via compact CNN features
2020, cites this paper
Recent trends in multicue based visual tracking: A review
2020, cites this paper
Robust fusion for RGB-D tracking using CNN features
2020, cites this paper
Infrared and visible image fusion via detail preserving adversarial learning
2020, cites this paper
GreenSea: Visual Soccer Analysis Using Broad Learning System
2020, cites this paper
A novel approach for multi-cue feature fusion for robust object tracking
2020, cites this paper
BiFNet: Bidirectional Fusion Network for Road Segmentation
2020, cites this paper
Hierarchical multi-modal fusion FCN with attention model for RGB-D tracking
2019, cites this paper
Target Tracking Control Based on Dual Model Fusion
2019, influential citation
DAL - A Deep Depth-aware Long-term Tracker
2019, influential citation
The Seventh Visual Object Tracking VOT2019 Challenge Results
2019, influential citation
Recycling lithium-ion batteries from electric vehicles
2019, cites this paper
View Invariant Human Action Recognition Using 3D Geometric Features
2019, cites this paper
Target-Aware Correlation Filter Tracking in RGBD Videos
2019, cites this paper
CDTB: A Color and Depth Visual Object Tracking Dataset and Benchmark
2019, influential citation
Visual Object Tracking in RGB-D Data via Genetic Feature Learning
2019, cites this paper
[The research on cardiac volume-time relationship based on retrospective electrocardiograph four-dimension computer tomography data collection and structured sparse algorithm].
2018, cites this paper
Object Tracking by Reconstruction With View-Specific Discriminative Correlation Filters
2018, influential citation
MS3D: Mean-Shift Object Tracking Boosted by Joint Back Projection of Color and Depth
2018, cites this paper
Multimodal Deep Feature Fusion (MMDFF) for RGB-D Tracking
2018, cites this paper