Stochastic Activation Pruning for Robust Adversarial Defense

Guneet Singh Dhillon, K. Azizzadenesheli, Zachary Chase Lipton, Jeremy Bernstein, Jean Kossaifi, Aran Khanna, Anima Anandkumar

Published 2018 in International Conference on Learning Representations

ABSTRACT

Neural networks are known to be vulnerable to adversarial examples. Carefully chosen perturbations to real images, while imperceptible to humans, induce misclassification and threaten the reliability of deep learning systems in the wild. To guard against adversarial examples, we take inspiration from game theory and cast the problem as a minimax zero-sum game between the adversary and the model. In general, for such games, the optimal strategy for both players requires a stochastic policy, also known as a mixed strategy. In this light, we propose Stochastic Activation Pruning (SAP), a mixed strategy for adversarial defense. SAP prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate. We can apply SAP to pretrained networks, including adversarially trained models, without fine-tuning, providing robustness against adversarial examples. Experiments demonstrate that SAP confers robustness against attacks, increasing accuracy and preserving calibration.
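
The pruning step described in the abstract can be sketched for a single layer as follows. This is a minimal illustration, not the authors' implementation. The abstract does not spell out the sampling scheme, so the sketch assumes that activations are sampled with replacement with probability proportional to their magnitude, and that each survivor is divided by its probability of surviving so the layer output is unchanged in expectation. The function name `stochastic_activation_pruning` and the parameter `num_samples` are hypothetical.

```python
import random


def stochastic_activation_pruning(activations, num_samples, rng=None):
    """Stochastically prune one layer's activations (illustrative sketch).

    Assumptions (not taken from the abstract): indices are drawn with
    replacement, proportionally to |activation|, and survivors are rescaled
    by the inverse of their survival probability.
    """
    rng = rng or random.Random()
    magnitudes = [abs(a) for a in activations]
    total = sum(magnitudes)
    if total == 0.0:
        # All activations are zero, so there is nothing to sample or rescale.
        return list(activations)

    # Sampling probability of each activation, proportional to its magnitude.
    probs = [m / total for m in magnitudes]

    # Draw indices with replacement; small activations are rarely drawn,
    # so they are preferentially pruned.
    kept = set(rng.choices(range(len(activations)), weights=probs, k=num_samples))

    pruned = []
    for j, a in enumerate(activations):
        if j in kept:
            # Probability that index j is drawn at least once in num_samples draws.
            keep_prob = 1.0 - (1.0 - probs[j]) ** num_samples
            pruned.append(a / keep_prob)  # scale up the survivor to compensate
        else:
            pruned.append(0.0)
    return pruned


if __name__ == "__main__":
    layer_output = [0.1, -2.0, 0.0, 0.5, 3.0]
    print(stochastic_activation_pruning(layer_output, num_samples=3, rng=random.Random(0)))
```

Because the step only transforms activations in the forward pass, a sketch like this could in principle be applied after each layer of an already trained network, which matches the abstract's claim that no fine-tuning is needed. Larger values of `num_samples` prune fewer activations.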

PUBLICATION RECORD

Publication year: 2018
Venue: International Conference on Learning Representations
Publication date: 2018-02-15
Fields of study: Mathematics, Computer Science
Identifiers: arXiv 1803.01442
Source metadata: Semantic Scholar

REFERENCES

signSGD: compressed optimisation for non-convex problems
2018, cited by this paper
Biologically inspired protection of deep networks from adversarial attacks
2017, cited by this paper
The Space of Transferable Adversarial Examples
2017, cited by this paper
Adversarial Attacks on Neural Network Policies
2017, cited by this paper
On Calibration of Modern Neural Networks
2017, cited by this paper
Vulnerability of Deep Reinforcement Learning to Policy Induction Attacks
2017, cited by this paper
Delving into adversarial attacks on deep policies
2017, cited by this paper
Towards Deep Learning Models Resistant to Adversarial Attacks
2017, influential reference
Ensemble Adversarial Training: Attacks and Defenses
2017, cited by this paper
Adversarial examples in the physical world
2016, influential reference
Practical Black-Box Attacks against Machine Learning
2016, cited by this paper
On the Effectiveness of Defensive Distillation
2016, cited by this paper
Deep Residual Learning for Image Recognition
2015, cited by this paper
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
2015, cited by this paper
Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding
2015, cited by this paper
Analysis of classifiers’ robustness to adversarial perturbations
2015, cited by this paper
Human-level control through deep reinforcement learning
2015, cited by this paper
Dropout: a simple way to prevent neural networks from overfitting
2014, cited by this paper
Explaining and Harnessing Adversarial Examples
2014, influential reference
Intriguing properties of neural networks
2013, cited by this paper
The Arcade Learning Environment: An Evaluation Platform for General Agents
2012, cited by this paper
Learning Multiple Layers of Features from Tiny Images
2009, cited by this paper
A Course in Game Theory
1995, cited by this paper

CITED BY

A Comprehensive Review: The Evolving Cat-and-Mouse Game in Network Intrusion Detection Systems Leveraging Machine Learning
2026, cites this paper
FGAA: Enhancing adversarial robustness in AIoT-enabled smart systems via Fine-Grained Activation Alignment
2026, cites this paper
Multi-Scale Shapley Adaptation Pruning: Realizing Backdoor Defense in Brain-Computer Interface With Shapley-Value-Based Neural Network Pruning
2026, cites this paper
Towards Compact and Robust DNNs via Compression-aware Sharpness Minimization
2026, cites this paper
Riemannian Dueling Optimization
2026, cites this paper
Secure Communications, Sensing, and Computing Towards Next-Generation Networks
2026, cites this paper
Defenses Against Evasion Attacks in the Eyes of Automotive Industry: Review From a Practical Perspective
2025, cites this paper
Pruning Strategies for Backdoor Defense in LLMs
2025, cites this paper
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
2025, cites this paper
Studying Various Activation Functions and Non-IID Data for Machine Learning Model Robustness
2025, cites this paper
Adversarial Image Purification by Explaining Adversarial Detectors
2025, cites this paper
A comprehensive survey of adversarial defense techniques in the visual domain
2025, cites this paper
Visualisation of cyber vulnerabilities in maritime human-autonomy teaming technology
2025, cites this paper
TopoReformer: Mitigating Adversarial Attacks Using Topological Purification in OCR Models
2025, cites this paper
A Comparative Analysis of Adversarial Attacks using Machine Learning Techniques
2025, cites this paper
Test-Time Defense Against Adversarial Attacks via Stochastic Resonance of Latent Ensembles
2025, cites this paper
Efficient Adversarial Malware Defense via Trust-Based Raw Override and Confidence-Adaptive Bit-Depth Reduction
2025, cites this paper
Universal Properties of Activation Sparsity in Modern Large Language Models
2025, cites this paper
DeepDefense: Layer-Wise Gradient-Feature Alignment for Building Robust Neural Networks
2025, cites this paper
Zero-Shot Robustness of Vision Language Models Via Confidence-Aware Weighting
2025, cites this paper
Lattice Climber Attack: Adversarial attacks for randomized mixtures of classifiers
2025, cites this paper
Imperceptible Backdoor Attacks on Text-Guided 3D Scene Grounding
2025, cites this paper
AutoAdvExBench: Benchmarking autonomous exploitation of adversarial example defenses
2025, cites this paper
TND: Two-stage non-invasive defense of intrusion detection system from adversarial attack
2025, cites this paper
Unveiling the Role of Randomization in Multiclass Adversarial Classification: Insights from Graph Theory
2025, cites this paper
SMiLE: Provably Enforcing Global Relational Properties in Neural Networks
2025, cites this paper
Evidence-Based Multi-Feature Fusion for Adversarial Robustness
2025, cites this paper
Dynamic Layer Routing Defense for Real-Time Embedded Vision
2025, cites this paper
Neuroplasticity in Artificial Intelligence - An Overview and Inspirations on Drop In & Out Learning
2025, cites this paper
Defense against Adversarial Attacks in Image Recognition Based on Multilayer Filters
2024, cites this paper
Proactive Schemes: A Survey of Adversarial Attacks for Social Good
2024, cites this paper
Iterative Window Mean Filter: Thwarting Diffusion-Based Adversarial Purification
2024, cites this paper
Adversarial Defense Based on Denoising Convolutional Autoencoder in EEG-Based Brain–Computer Interfaces
2024, cites this paper
Privacy-Preserving Universal Adversarial Defense for Black-Box Models
2024, cites this paper
Adversarial attacks and defenses for large language models (LLMs): methods, frameworks & challenges
2024, cites this paper
Beyond Dropout: Robust Convolutional Neural Networks Based on Local Feature Masking
2024, cites this paper
ProFeAT: Projected Feature Adversarial Training for Self-Supervised Learning of Robust Representations
2024, cites this paper
Calibration Attacks: A Comprehensive Study of Adversarial Attacks on Model Confidence
2024, cites this paper
MC-Net: Realistic Sample Generation for Black-Box Attacks
2024, cites this paper
Knowledge distillation via Noisy Feature Reconstruction
2024, cites this paper
A new honeypot defence for deep neural networks
2024, cites this paper
Towards Gradient-Based Saliency Consensus Training for Adversarial Robustness
2024, cites this paper
A Survey of Neural Network Robustness Assessment in Image Recognition
2024, cites this paper
Enhancing Adversarial Robustness for High-Speed Train Bogie Fault Diagnosis Based on Adversarial Training and Residual Perturbation Inversion
2024, cites this paper
An AI blue team playbook
2024, cites this paper
Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection
2024, cites this paper
Enhance DNN Adversarial Robustness and Efficiency via Injecting Noise to Non-Essential Neurons
2024, cites this paper
CAEN: efficient adversarial robustness with categorized ensemble of networks
2024, cites this paper
MaskDroid: Robust Android Malware Detection with Masked Graph Representations
2024, cites this paper
A Security-Oriented Overview of Federated Learning Utilizing Layered Reference Model
2024, cites this paper
Accelerated Smoothing: A Scalable Approach to Randomized Smoothing
2024, cites this paper
A Multi-Task Adversarial Attack against Face Authentication
2024, cites this paper
Certifying Global Robustness for Deep Neural Networks
2024, cites this paper
Trustworthy machine learning in the context of security and privacy
2024, cites this paper
Physical Backdoor: Towards Temperature-Based Backdoor Attacks in the Physical World
2024, cites this paper
Enhancing Model Robustness and Accuracy Against Adversarial Attacks via Adversarial Input Training
2024, cites this paper
Trans-IFFT-FGSM: a novel fast gradient sign method for adversarial attacks
2024, cites this paper
Standard-Deviation-Inspired Regularization for Improving Adversarial Robustness
2024, cites this paper
DiffBreak: Is Diffusion-Based Purification Robust?
2024, cites this paper
Sliced Wasserstein adversarial training for improving adversarial robustness
2024, cites this paper
Video Corpus Moment Retrieval via Deformable Multigranularity Feature Fusion and Adversarial Training
2024, cites this paper
First line of defense: A robust first layer mitigates adversarial attacks
2024, cites this paper
SMLE: Safe Machine Learning via Embedded Overapproximation
2024, cites this paper
FlexLLM: Exploring LLM Customization for Moving Target Defense on Black-Box LLMs Against Jailbreak Attacks
2024, cites this paper
Backdoor Attacks on Bimodal Salient Object Detection with RGB-Thermal Data
2024, cites this paper
Solving Differential Equations with Constrained Learning
2024, cites this paper
Boosting adversarial robustness via feature refinement, suppression, and alignment
2024, cites this paper
Provable Robustness for Streaming Models with a Sliding Window
2023, cites this paper
Improving the Transferability of Adversarial Examples via Direction Tuning
2023, cites this paper
Beyond Empirical Risk Minimization: Local Structure Preserving Regularization for Improving Adversarial Robustness
2023, cites this paper
Reducing classifier overconfidence against adversaries through graph algorithms
2023, cites this paper
Aux-Drop: Handling Haphazard Inputs in Online Learning Using Auxiliary Dropouts
2023, cites this paper
Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification
2023, cites this paper
Feature Separation and Recalibration for Adversarial Robustness
2023, cites this paper
Consistent Valid Physically-Realizable Adversarial Attack Against Crowd-Flow Prediction Models
2023, cites this paper
Adversarial examples: attacks and defences on medical deep learning systems
2023, cites this paper
On the Robustness of Randomized Ensembles to Adversarial Perturbations
2023, cites this paper
Exploring the Effect of Randomness on Transferability of Adversarial Samples Against Deep Neural Networks
2023, cites this paper
Randomization for adversarial robustness: the Good, the Bad and the Ugly
2023, cites this paper
A Survey on Learning to Reject
2023, cites this paper
Robustness Analysis of Discrete State-Based Reinforcement Learning Models in Traffic Signal Control
2023, cites this paper
Rethinking the Entropy of Instance in Adversarial Training
2023, cites this paper
ESSENCE: Exploiting Structured Stochastic Gradient Pruning for Endurance-Aware ReRAM-Based In-Memory Training Systems
2023, cites this paper
Co(ve)rtex: ML Models as storage channels and their (mis-)applications
2023, cites this paper
xNIDS: Explaining Deep Learning-based Network Intrusion Detection Systems for Active Intrusion Responses
2023, cites this paper
Adversarial Machine Learning for Network Intrusion Detection Systems: A Comprehensive Survey
2023, cites this paper
On the Relationship Between Universal Adversarial Attacks and Sparse Representations
2023, cites this paper
Interpreting Universal Adversarial Example Attacks on Image Classification Models
2023, cites this paper
Towards Augmentation Based Defense Strategies Against Adversarial Attacks
2023, cites this paper
Parameter-constrained adversarial training
2023, cites this paper
Topology-Preserving Adversarial Training
2023, cites this paper
Improving the Robustness of Transformer-based Large Language Models with Dynamic Attention
2023, cites this paper
Variational Adversarial Defense: A Bayes Perspective for Adversarial Training
2023, cites this paper
Type-I Generative Adversarial Attack
2023, cites this paper
Adversarial Purification of Information Masking
2023, cites this paper
An Introduction to Adversarially Robust Deep Learning
2023, cites this paper
Improving Robustness via Tilted Exponential Layer: A Communication-Theoretic Perspective
2023, influential citation
Certifiable Black-Box Attacks with Randomized Adversarial Examples: Breaking Defenses with Provable Confidence
2023, cites this paper
A comprehensive survey of robust deep learning in computer vision
2023, cites this paper
Fast Propagation Is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks
2023, cites this paper