Learning both Weights and Connections for Efficient Neural Network

Published 2015 in Neural Information Processing Systems

ABSTRACT

Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Also, conventional networks fix the architecture before training starts; as a result, training cannot improve the architecture. To address these limitations, we describe a method to reduce the storage and computation required by neural networks by an order of magnitude without affecting their accuracy by learning only the important connections. Our method prunes redundant connections using a three-step method. First, we train the network to learn which connections are important. Next, we prune the unimportant connections. Finally, we retrain the network to fine tune the weights of the remaining connections. On the ImageNet dataset, our method reduced the number of parameters of AlexNet by a factor of 9x, from 61 million to 6.7 million, without incurring accuracy loss. Similar experiments with VGG-16 found that the number of parameters can be reduced by 13x, from 138 million to 10.3 million, again with no loss of accuracy.
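The three steps in the abstract (train, prune, retrain) can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation. The abstract does not state how importance is measured, so the sketch assumes absolute weight magnitude as the criterion. It also assumes a small linear regression model in place of a deep network. The names `magnitude_mask` and `train` and the 80% sparsity level are hypothetical choices made for this example.

```python
import numpy as np


def magnitude_mask(weights, sparsity):
    """Boolean mask that keeps the largest-magnitude weights.

    Assumption: importance is measured by |w|; `sparsity` is the
    fraction of connections to remove.
    """
    k = int(round(sparsity * weights.size))  # number of connections to drop
    if k == 0:
        return np.ones(weights.shape, dtype=bool)
    # Indices of the k smallest-magnitude weights.
    drop = np.argpartition(np.abs(weights).ravel(), k - 1)[:k]
    mask = np.ones(weights.size, dtype=bool)
    mask[drop] = False
    return mask.reshape(weights.shape)


def train(w, X, y, mask=None, lr=0.1, steps=500):
    """Gradient descent on mean squared error; pruned weights stay at zero."""
    if mask is None:
        mask = np.ones(w.shape, dtype=bool)
    w = w * mask
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask  # re-apply the mask after every update
    return w


# Toy data: only 4 of the 20 input connections actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
true_w = np.zeros(20)
true_w[:4] = [3.0, -2.0, 1.5, 4.0]
y = X @ true_w + 0.01 * rng.normal(size=200)

w_dense = train(np.zeros(20), X, y)            # step 1: train the dense model
mask = magnitude_mask(w_dense, sparsity=0.8)   # step 2: prune 16 of 20 weights
w_sparse = train(w_dense, X, y, mask=mask)     # step 3: retrain the survivors

mse_sparse = float(np.mean((X @ w_sparse - y) ** 2))
```

Re-applying the mask after each update keeps pruned connections at zero during retraining, so only the surviving weights are fine-tuned. The reduction factors quoted in the abstract follow from the parameter counts: 61 / 6.7 ≈ 9.1 for AlexNet and 138 / 10.3 ≈ 13.4 for VGG-16.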

PUBLICATION RECORD

Publication year: 2015
Venue: Neural Information Processing Systems
Publication date: 2015-06-08
Fields of study: Computer Science
Identifiers: arXiv 1506.02626
Source metadata: Semantic Scholar
