Architectural Insights into Knowledge Distillation for Object Detection: A Comprehensive Review

Mahdi Golizadeh, Nassibeh Golizadeh, Mohammad Ali Keyvanrad, Hossein Shirazi

Published 2025 in arXiv.org

ABSTRACT

Object detection has achieved remarkable accuracy through deep learning, yet these improvements often come with increased computational cost, limiting deployment on resource-constrained devices. Knowledge Distillation (KD) provides an effective solution by enabling compact student models to learn from larger teacher models. However, adapting KD to object detection poses unique challenges due to its dual objectives (classification and localization), as well as foreground-background imbalance and multi-scale feature representation. This review introduces a novel architecture-centric taxonomy for KD methods, distinguishing between CNN-based detectors (covering backbone-level, neck-level, head-level, and RPN/RoI-level distillation) and Transformer-based detectors (including query-level, feature-level, and logit-level distillation). We further evaluate representative methods on the MS COCO and PASCAL VOC datasets, using mAP@0.5 as the performance metric, and provide a comparative analysis of their effectiveness. The proposed taxonomy and analysis aim to clarify the evolving landscape of KD in object detection, highlight current challenges, and guide future research toward efficient and scalable detection systems.
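
As an illustration of the basic idea of a student learning from a teacher, the following minimal sketch shows the classic temperature-softened logit distillation loss for a single classification output. It is not taken from the review and does not represent any of the detector-specific methods it surveys; the function names (softmax, kd_loss), the temperature value, and the example logits are all hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max logit for stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is the generic logit-level loss, shown for one
    prediction; detection methods add localization and feature terms.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student's softened prediction
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-class prediction.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # disagrees with the teacher

print(kd_loss(close_student, teacher))
print(kd_loss(far_student, teacher))
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge, so the student that agrees with the teacher receives the smaller penalty. In practice this term is usually combined with the ordinary supervised loss on ground-truth labels.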

PUBLICATION RECORD

Publication year
2025
Venue
arXiv.org
Publication date
2025-08-05
Fields of study
Computer Science
Identifiers
DOI: 10.48550/arXiv.2508.03317; arXiv: 2508.03317
Source metadata
Semantic Scholar
