Mitigating Adversarial Attacks using Pruning
V. Mishra, Aditya Varshney, Shekhar Yadav
Published 2023 in International Conference on Contemporary Computing
ABSTRACT
The advent of deep learning has revolutionized the technology industry and made Deep Neural Networks (DNNs) the powerhouse of many modern-day software applications. Well-trained DNNs can perform complex tasks such as speech recognition, object detection, and image classification with high precision and accuracy. However, training such complex networks at times requires enormous computational resources, so the task is often outsourced to third parties. Recent work suggests that outsourced training can serve as a favourable gateway for a malicious trainer to implant a backdoor in the model which, when triggered, forces the model to behave in a way predefined by the attacker. This paper begins with an overview of how such attacks are mounted and then discusses, with experimental evidence, strategies that can be used to neutralise them. We use the l1 and l2 norms to identify weights that are susceptible to poisoning and prune them away by setting their values to zero. We further examine the efficiency of layer-wise and global pruning. Our experiments show that fine-tuning the model for a few epochs after the fine-pruning stage helps it regain lost accuracy and yields better test-time accuracy. We also find that fine-pruning the later layers is more effective. By pruning the last layer and fine-tuning the model after fine-pruning, we achieve 99.96% and 86.05% accuracy on the clean validation and test datasets respectively, and witness a drop in attack success rate from 99% to 0%, or near 0% in some cases.
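The pruning step the abstract describes, ranking units by the l1 norm of their weights and zeroing out the smallest, can be sketched as a minimal NumPy illustration. This is our own hedged reconstruction, not the paper's implementation: the function name `l1_prune_layer` and the `prune_frac` parameter are hypothetical, and real fine-pruning would operate on a trained network's final layer before fine-tuning.

```python
import numpy as np

def l1_prune_layer(weights, prune_frac=0.3):
    """Zero out the output units of a layer's weight matrix whose
    l1 norms are smallest (magnitude-based pruning sketch).

    weights: 2-D array of shape (num_units, fan_in)
    prune_frac: fraction of units to prune away
    """
    norms = np.abs(weights).sum(axis=1)       # l1 norm of each unit's weights
    k = int(prune_frac * weights.shape[0])    # number of units to prune
    pruned = weights.copy()
    if k > 0:
        idx = np.argsort(norms)[:k]           # indices of smallest-norm units
        pruned[idx, :] = 0.0                  # prune by setting weights to zero
    return pruned
```

After a step like this, the abstract's procedure would fine-tune the pruned model for a few epochs on clean data to recover the accuracy lost to pruning.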
PUBLICATION RECORD
- Publication year
2023
- Venue
International Conference on Contemporary Computing
- Publication date
2023-08-03
- Fields of study
Computer Science
- Source metadata
Semantic Scholar