How transferable are features in deep neural networks?

J. Yosinski, J. Clune, Yoshua Bengio, Hod Lipson

Published 2014 in Neural Information Processing Systems

ABSTRACT

Many deep neural networks trained on natural images exhibit a curious phenomenon in common: on the first layer they learn features similar to Gabor filters and color blobs. Such first-layer features appear not to be specific to a particular dataset or task, but general in that they are applicable to many datasets and tasks. Features must eventually transition from general to specific by the last layer of the network, but this transition has not been studied extensively. In this paper we experimentally quantify the generality versus specificity of neurons in each layer of a deep convolutional neural network and report a few surprising results. Transferability is negatively affected by two distinct issues: (1) the specialization of higher layer neurons to their original task at the expense of performance on the target task, which was expected, and (2) optimization difficulties related to splitting networks between co-adapted neurons, which was not expected. In an example network trained on ImageNet, we demonstrate that either of these two issues may dominate, depending on whether features are transferred from the bottom, middle, or top of the network. We also document that the transferability of features decreases as the distance between the base task and target task increases, but that transferring features even from distant tasks can be better than using random features. A final surprising result is that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.
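The layer-transfer setup the abstract describes can be illustrated with a minimal sketch. It assumes a toy network stored as a list of weight matrices; the helper names `make_network` and `transfer_layers` are hypothetical and are not taken from the paper. The sketch copies the first n layers of a base network into a freshly initialized target network and marks whether the copied layers stay frozen or are fine-tuned. It does not train anything and does not reproduce the paper's experiments.

```python
import copy
import random


def make_network(layer_sizes, seed):
    """Build a toy network as a list of weight matrices (lists of lists)."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def transfer_layers(base, target, n, fine_tune):
    """Copy the first n layers of `base` into a copy of `target`.

    Returns (network, trainable), where trainable[i] says whether layer i
    would be updated while training on the target task.
    """
    if len(base) != len(target) or not 0 <= n <= len(base):
        raise ValueError("depths must match and n must be in [0, depth]")
    network = copy.deepcopy(target)
    network[:n] = copy.deepcopy(base[:n])
    # Transferred layers are trainable only when fine-tuning; the remaining
    # randomly initialized layers are always trained.
    trainable = [fine_tune if i < n else True for i in range(len(network))]
    return network, trainable


# Hypothetical stand-ins: `base` plays the role of a network trained on the
# base task, `fresh` a random initialization for the target task.
base = make_network([8, 6, 4, 3], seed=0)
fresh = make_network([8, 6, 4, 3], seed=1)

frozen_net, frozen_flags = transfer_layers(base, fresh, n=2, fine_tune=False)
tuned_net, tuned_flags = transfer_layers(base, fresh, n=2, fine_tune=True)
print(frozen_flags)  # [False, False, True]
print(tuned_flags)   # [True, True, True]
```

Varying `n` from 0 to the network depth and comparing target-task accuracy for the frozen and fine-tuned variants is the kind of layer-by-layer comparison the abstract refers to.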

PUBLICATION RECORD

Publication year: 2014
Venue: Neural Information Processing Systems
Publication date: 2014-11-06
Fields of study: Computer Science
Identifiers: arXiv 1411.1792
External record: Semantic Scholar
Source metadata: Semantic Scholar

REFERENCES

Caffe: Convolutional Architecture for Fast Feature Embedding (2014; influential reference)
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks (2013; cited by this paper)
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition (2013; cited by this paper)
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation (2013; cited by this paper)
Visualizing and Understanding Convolutional Networks (2013; cited by this paper)
ImageNet classification with deep convolutional neural networks (2012; influential reference)
Improving neural networks by preventing co-adaptation of feature detectors (2012; cited by this paper)
Deep Learning of Representations for Unsupervised and Transfer Learning (2011; cited by this paper)
Deep Learners Benefit More from Out-of-Distribution Examples (2011; cited by this paper)
ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning (2011; cited by this paper)
Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations (2009; cited by this paper)
ImageNet: A large-scale hierarchical image database (2009; influential reference)
What is the best multi-stage architecture for object recognition? (2009; influential reference)
Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories (2004; cited by this paper)
Learning Many Related Tasks at the Same Time with Backpropagation (1994; cited by this paper)

CITED BY

From Generative Modeling to Clinical Classification: A GPT-Based Architecture for EHR Notes (2026; cites this paper)
Use of Machine Learning for Determination of Deformation Silica Sand Quartz Particles (2026; cites this paper)
Resolving data imbalance in transfer learning: a simple random sampling approach (2026; cites this paper)
Putting a Face to Forgetting: Continual Learning meets Mechanistic Interpretability (2026; cites this paper)
Beyond relighting: RTI for clustering fragmented heritage textiles using deep learning (2026; cites this paper)
Deep Learning Framework for Damage Prediction in Low-Velocity Impact (2026; cites this paper)
HCL Net: Deep Learning for Accurate Classification of Honeycombing Lung and Ground Glass Opacity in CT Images (2026; cites this paper)
RLV: LLM-based vulnerability detection by retrieving and refining contextual information (2026; cites this paper)
Multi-Class Brain Tumor Detection Using Transfer Learning and Interpretable Deep Models (2026; cites this paper)
PSRGAN-RK: Real-time infrared super-resolution based on MobileNetV4 and RKNN (2026; cites this paper)
Personalized Federated Learning with Hierarchical Two-Branch Aggregation for Few-Shot Scenarios (2026; cites this paper)
Design and Implementation of an Annotation-Driven Drone Autonomy Tool Using YOLOv8–V11 Architectures for Real-Time Object Detection and Distance Estimation (2026; influential citation)
A comprehensive review of convolutional neural networks: foundations, enhancements and applications (2026; cites this paper)
Accelerated degradation framework of lithium-ion batteries with physics-informed domain shift learning (2026; cites this paper)
Deep learning models for efficient geotechnical predictions: reducing training effort and data requirements with transfer learning (2026; cites this paper)
Transfer learning-based soybean LAI estimations by integrating PROSAIL, UAV, and PlanetScope imagery (2026; cites this paper)
Novel decoupling algorithm based on transfer learning for multi-axis force sensor (2026; cites this paper)
Local Layer-wise Differential Privacy in Federated Learning (2026; cites this paper)
Convex Efficient Coding (2026; cites this paper)
Fine-tuning ImageNet-pretrained models in medical image classification: Reassessing the impact of different factors (2026; cites this paper)
CSN: A compact semantic segmentation network for visual scene perception in assistive navigation (2026; cites this paper)
Visual Prompt-Agnostic Evolution (2026; cites this paper)
On the Relationship Between Representation Geometry and Generalization in Deep Neural Networks (2026; cites this paper)
Erase at the Core: Representation Unlearning for Machine Unlearning (2026; cites this paper)
SCA-Net: Spatial-Contextual Aggregation Network for Enhanced Small Building and Road Change Detection (2026; cites this paper)
ProbeSpec: Robust Model Fingerprinting via Dynamic Perturbation Response Spectrum (2026; cites this paper)
A morphologically diverse freshwater microalgae dataset for deep learning-based classification with transfer learning analysis (2026; cites this paper)
Dual-Modal Deep Learning with In-Domain Training and Attention for Infant Brain Myelination Prediction (2026; cites this paper)
A simulation-based deep learning approach to operational robot fleet sizing in robotic mobile fulfillment systems (2026; cites this paper)
Semantic segmentation for feature detection in ocean bottom seismometer data (2026; cites this paper)
A Survey on Deep Learning-Based Chinese Font Style Transfer (2026; cites this paper)
Toward On-Device Distributed Photovoltaic Power Forecasting: A Tiny Online Learning Framework (2026; cites this paper)
Mathematics of digital twins and transfer learning for systems governed by PDE models (2026; cites this paper)
Efficient Bayesian inversion of hydraulic parameters for unsaturated soil slopes using a fine-tuned deep operator network (2026; cites this paper)
Tillage-induced soil feature extraction and multi-sensors fusion for tillage system classification (2026; cites this paper)
Machine learning-based prediction and optimization of performances of porous ZrO2/Al2O3 ceramics: a few-shot learning approach (2026; cites this paper)
Reliable leukemia detection via transfer-enhanced Bayesian CNNs (2026; cites this paper)
Evaluating transfer learning strategies for improving dairy cattle body weight prediction in small farms using depth-image and point-cloud data (2026; cites this paper)
Clustering of Temporal and Visual Data: Recent Advancements (2026; cites this paper)
Beyond the final layer: Attentive multilayer fusion for vision transformers (2026; cites this paper)
A Lightweight Frozen Multi-Convolution Dual-Branch Network for Efficient sEMG-Based Gesture Recognition (2026; cites this paper)
Obsidian sourcing in Mesoamerica and the Isthmo-Colombian area using image-based machine learning (2026; cites this paper)
Consistency-Regularized GAN for Few-Shot SAR Target Recognition (2026; cites this paper)
MGM as a Large-Scale Pretrained Foundation Model for Microbiome Analyses in Diverse Contexts. (2026; cites this paper)
SiGMa-Net II: Distinguishing Binary Black Holes from Glitches (2026; cites this paper)
Improved Strawberry Disease Classification under Class Imbalance through In-Backbone Latent Diffusion (2026; cites this paper)
A hybrid physics-informed and data-driven model for estimating ocean internal wave phase speeds from remote sensing imagery (2026; cites this paper)
Trust Region Continual Learning as an Implicit Meta-Learner (2026; cites this paper)
PriorProbe: Recovering Individual-Level Priors for Personalizing Neural Networks in Facial Expression Recognition (2026; cites this paper)
Noisy models of the ventral stream reveal the impact of recurrence and learned representations on information processing timescales (2026; cites this paper)
Performance prediction of the centrifugal compressor using a long short-term memory-based model with transfer learning and similarity conversion (2026; cites this paper)
BinSight: Enhancing Executable Binary Classification Accuracy Through Kernel Density Estimation-Based Visualization (2026; cites this paper)
1%>100%: High-Efficiency Visual Adapter with Complex Linear Projection Optimization (2026; cites this paper)
ProtoMark: A transfer prototype network with Markov-enhanced features for few-shot intrusion detection (2026; cites this paper)
WSBD: Freezing-Based Optimizer for Quantum Neural Networks (2026; cites this paper)
Cross-household Transfer Learning Approach with LSTM-based Demand Forecasting (2026; cites this paper)
A temporally transferrable approach for quantifying crop residue cover: linking deep learning of UAV images with satellite-based spectral modelling (2026; cites this paper)
Neural-POD: A Plug-and-Play Neural Operator Framework for Infinite-Dimensional Functional Nonlinear Proper Orthogonal Decomposition (2026; cites this paper)
Classification of rice plant diseases using efficient DenseNet121 (2026; cites this paper)
Free-scale quantification of biomass and fucoxanthin in mass culture of Phaeodactylum tricornutum using Raman spectroscopy coupled with machine learning and transfer learning. (2026; cites this paper)
Satellite-Based Detection of Looted Archaeological Sites Using Machine Learning (2026; cites this paper)
Predicting the Unseen: Transductive Transfer Learning in Real Estate Price Prediction (2026; cites this paper)
Energy-Efficient Prediction in Textile Manufacturing: Enhancing Accuracy and Data Efficiency With Ensemble Deep Transfer Learning (2026; cites this paper)
Classification of sea-ice concentration from ship-board S-band radar images using open-source machine learning tools (2026; cites this paper)
Bio-Inspired Fine-Tuning for Selective Transfer Learning in Image Classification (2026; cites this paper)
Feature Responsive LoRA: Toward Parameter-Efficient Transfer Learning for Self-Supervised Visual Models (2026; cites this paper)
Autonomous component optimization method for steel braced frame structures based on multi-agent and physics-informed deep reinforcement learning (2026; cites this paper)
Monthly monitoring of urban development and renewal at the block-level in China using Sentinel-2 time series (2026; cites this paper)
Fatigue delamination shape prognostics in composites using numerical simulation-assisted transfer learning (2026; cites this paper)
Deep transfer learning enhanced sparse-angle terahertz rotational coherent scattering super-resolution imaging (2026; cites this paper)
A novel two-stage heterogeneous transfer learning framework for the estimation of the remaining useful life of industrial components (2026; cites this paper)
Large mRNA language foundation modeling with NUWA for unified sequence perception and generation (2026; cites this paper)
Circuit-to-Graph: Power Converters Modeling With Multidimensional Generalization (2026; cites this paper)
Fleet-based transfer learning for anomaly detection in industrial systems (2026; cites this paper)
DCDLNet: A label-noise tolerant classification algorithm for polsar images based on dual-band consistency and difference (2026; cites this paper)
Combining datasets with different ground truths using Low-Rank Adaptation to generalize image-based CNN models for photometric redshift prediction (2026; cites this paper)
Zero-shot Forecasting by Simulation Alone (2026; cites this paper)
A Comparative Study of Custom CNNs, Pre-trained Models, and Transfer Learning Across Multiple Visual Datasets (2026; influential citation)
Evolving Programmatic Skill Networks (2026; cites this paper)
Comparative Analysis of Custom CNN Architectures versus Pre-trained Models and Transfer Learning: A Study on Five Bangladesh Datasets (2026; cites this paper)
Surpassing stock market's noise-nonstationarity tradeoff by causal-based domain discovery (2026; cites this paper)
Efficient data-driven approach for bending deformation capture of fiber-reinforced soft actuators (2026; cites this paper)
An inter-domain feature discrepancy method for multi-source partial domain fault diagnosis (2026; cites this paper)
FARR: An efficient frozen-feature learning framework for wood species identification with applications to texture recognition (2026; cites this paper)
Neuro-Symbolic Activation Discovery: Transferring Mathematical Structures from Physics to Ecology for Parameter-Efficient Neural Networks (2026; cites this paper)
Microscale-Searching Optimization for Transfer-Learning-based Filter Fine-Tuning (2026; cites this paper)
End-to-End Motorcycle Violation Detection with Region-Specific Automatic License Plate Recognition (2026; cites this paper)
SkeFi: Cross-Modal Knowledge Transfer for Wireless Skeleton-Based Action Recognition (2026; cites this paper)
Improving the efficiency of QAOA using efficient parameter transfer initialization and targeted-single-layer regularized optimization with minimal performance degradation (2026; cites this paper)
Study on Methods and a System for Real-Time Monitoring of the Remaining Useful Life of a Milling Cutter (2026; cites this paper)
UAV-Based Forest Fire Early Warning and Intervention Simulation System with High-Accuracy Hybrid AI Model (2026; cites this paper)
YOLO-DS: Fine-Grained Feature Decoupling via Dual-Statistic Synergy Operator for Object Detection (2026; cites this paper)
CHARMS: A CNN-Transformer Hybrid with Attention Regularization for MRI Super-Resolution (2026; cites this paper)
An Hour-Specific Hybrid DNN–SVR Framework for National-Scale Short-Term Load Forecasting (2026; cites this paper)
YUAR: A Reliable Computer Vision Method for Aircraft Docking and Push‐Back Recognition at Airport Gates (2026; cites this paper)
Estimating proximity to muscular failure using surface EMG and deep learning. (2026; cites this paper)
Perceived streetscape quality and bike lane effectiveness: a computer vision approach (2026; cites this paper)
HP-GAN: Harnessing pretrained networks for GAN improvement with FakeTwins and discriminator consistency. (2026; cites this paper)
Transfer learning-based retrieval of cloud base height from FY-4B satellite observations (2026; cites this paper)
Federated Incremental Prognostics for System-Specific Degradation Tracking in Spacecraft Propulsion Systems (2026; cites this paper)