Stochastic Activation Pruning for Robust Adversarial Defense

Guneet Singh Dhillon, Kamyar Azizzadenesheli, Zachary Chase Lipton, Jeremy Bernstein, Jean Kossaifi, Aran Khanna, Anima Anandkumar

Published 2018 in International Conference on Learning Representations

ABSTRACT

Neural networks are known to be vulnerable to adversarial examples. Carefully chosen perturbations to real images, while imperceptible to humans, induce misclassification and threaten the reliability of deep learning systems in the wild. To guard against adversarial examples, we take inspiration from game theory and cast the problem as a minimax zero-sum game between the adversary and the model. In general, for such games, the optimal strategy for both players requires a stochastic policy, also known as a mixed strategy. In this light, we propose Stochastic Activation Pruning (SAP), a mixed strategy for adversarial defense. SAP prunes a random subset of activations (preferentially pruning those with smaller magnitude) and scales up the survivors to compensate. We can apply SAP to pretrained networks, including adversarially trained models, without fine-tuning, providing robustness against adversarial examples. Experiments demonstrate that SAP confers robustness against attacks, increasing accuracy and preserving calibration.
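The pruning-and-rescaling step the abstract describes can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it samples `k` activations with replacement, with probability proportional to activation magnitude (so small activations are pruned preferentially), zeroes everything unsampled, and rescales each survivor by the inverse of its probability of being kept, so the layer output is unbiased in expectation.

```python
import numpy as np

def stochastic_activation_pruning(h, k, rng=None):
    """Sketch of SAP on one activation vector `h`.

    Samples k indices with replacement, with probability proportional
    to |h_i|; activations never sampled are pruned (set to zero), and
    survivors are scaled by 1 / P(sampled at least once) so that the
    expected output equals the original activations.
    """
    rng = np.random.default_rng() if rng is None else rng
    h = np.asarray(h, dtype=float)
    mags = np.abs(h)
    p = mags / mags.sum()  # sampling distribution over activations

    # Draw k samples with replacement; mark which indices survive.
    idx = rng.choice(h.size, size=k, replace=True, p=p)
    keep = np.zeros(h.size, dtype=bool)
    keep[idx] = True

    # P(index i sampled at least once in k draws) = 1 - (1 - p_i)^k.
    scale = np.where(keep, 1.0 / (1.0 - (1.0 - p) ** k), 0.0)
    return h * scale
```

Because the rescaling is the reciprocal of each survivor's keep probability, averaging the pruned output over many independent draws recovers the original activations, which is why SAP can be dropped into a pretrained network without fine-tuning.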

PUBLICATION RECORD

  • Publication year

    2018

  • Venue

    International Conference on Learning Representations

  • Publication date

    2018-02-15

  • Fields of study

    Mathematics, Computer Science

