An Empirical Analysis of Visual Features for Multiple Object Tracking in Urban Scenes
Mehdi Miah, J. Pepin, N. Saunier, Guillaume-Alexandre Bilodeau
Published 2020 in International Conference on Pattern Recognition
ABSTRACT
This paper addresses the problem of selecting appearance features for multiple object tracking (MOT) in urban scenes. Over the years, a large number of features has been used for MOT. However, it is not clear whether some of them are better than others. Commonly used features are color histograms, histograms of oriented gradients, deep features from convolutional neural networks and re-identification (ReID) features. In this study, we assess how good these features are at discriminating objects enclosed by a bounding box in urban scene tracking scenarios. Several affinity measures, namely the L1, L2 and the Bhattacharyya distances, Rank-1 counts and the cosine similarity, are also assessed for their impact on the discriminative power of the features. Results on several datasets show that features from ReID networks are the best for discriminating instances from one another regardless of the quality of the detector. If a ReID model is not available, color histograms may be selected if the detector has a good recall and there are few occlusions; otherwise, deep features are more robust to detectors with lower recall.
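The affinity measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, features are assumed to be plain lists of non-negative floats (for the Bhattacharyya distance, histograms normalized to sum to 1), and the Bhattacharyya distance is written in one common bounded form, sqrt(1 - BC), since the exact definition used in the paper is not given here. The Rank-1 count is assumed to be the number of queries whose closest gallery feature carries the same identity.

```python
import math
from typing import Callable, Hashable, Sequence

Vector = Sequence[float]


def l1_distance(p: Vector, q: Vector) -> float:
    """Sum of absolute component-wise differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def l2_distance(p: Vector, q: Vector) -> float:
    """Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def bhattacharyya_distance(p: Vector, q: Vector) -> float:
    """Bounded form sqrt(1 - BC); assumes p and q each sum to 1."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))  # Bhattacharyya coefficient
    return math.sqrt(max(0.0, 1.0 - bc))  # clamp guards against rounding error


def cosine_similarity(p: Vector, q: Vector) -> float:
    """Cosine of the angle between p and q; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0


def rank1_count(
    queries: Sequence[Vector],
    query_ids: Sequence[Hashable],
    gallery: Sequence[Vector],
    gallery_ids: Sequence[Hashable],
    distance: Callable[[Vector, Vector], float],
) -> int:
    """Count queries whose nearest gallery feature has the same identity."""
    hits = 0
    for feat, qid in zip(queries, query_ids):
        best = min(range(len(gallery)), key=lambda i: distance(feat, gallery[i]))
        if gallery_ids[best] == qid:
            hits += 1
    return hits
```

Note that the cosine similarity grows with affinity while the three distances shrink, so a tracker mixing them would need to convert one into the other (for example, using 1 minus the similarity as a distance).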
PUBLICATION RECORD
- Publication year: 2020
- Venue: International Conference on Pattern Recognition
- Publication date: 2020-10-15
- Fields of study: Computer Science
- Source metadata: Semantic Scholar