Mitigating Adversarial Attacks using Pruning
V. Mishra, Aditya Varshney, Shekhar Yadav
Published 2023 in International Conference on Contemporary Computing
ABSTRACT
The advent of deep learning has revolutionized the technology industry and made Deep Neural Networks (DNNs) the powerhouse of many modern-day software applications. Well-trained DNNs can perform complex tasks such as speech recognition, object detection, and image classification with high precision and accuracy. However, training such complex networks at times requires enormous computational resources, so the task is often outsourced to third parties. Recent work suggests that outsourced training can serve as a favourable gateway for a malicious trainer to implant a backdoor in the model which, when triggered, forces the model to behave in a way predefined by the attacker. This paper begins with an overview of how such attacks are mounted and then discusses, with experimental evidence, strategies that can be used to neutralise them. We use the l1 and l2 norms to identify weights that are susceptible to poisoning and prune them away by setting their values to zero. We further examine the efficiency of layer-wise and global pruning. Our experiments show that fine-tuning the model for a few epochs after the fine-pruning stage helps it regain lost accuracy and yields better test-time accuracy. We also find that fine-pruning the later layers is more effective. By pruning the last layer and fine-tuning the model after fine-pruning, we achieve 99.96% and 86.05% accuracy on the clean validation and test datasets respectively, and witness a drop in attack success rate from 99% to 0%, or near 0% in some cases.
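The pruning step the abstract describes, ranking units by the l1 norm of their weights and zeroing out the smallest, can be sketched as a minimal NumPy illustration. This is our own hedged reconstruction, not the paper's implementation: the function name `l1_prune_layer` and the `prune_frac` parameter are hypothetical, and real fine-pruning would operate on a trained network's final layer before fine-tuning.

```python
import numpy as np

def l1_prune_layer(weights, prune_frac=0.3):
    """Zero out the output units of a layer's weight matrix whose
    l1 norms are smallest (magnitude-based pruning sketch).

    weights: 2-D array of shape (num_units, fan_in)
    prune_frac: fraction of units to prune away
    """
    norms = np.abs(weights).sum(axis=1)       # l1 norm of each unit's weights
    k = int(prune_frac * weights.shape[0])    # number of units to prune
    pruned = weights.copy()
    if k > 0:
        idx = np.argsort(norms)[:k]           # indices of smallest-norm units
        pruned[idx, :] = 0.0                  # prune by setting weights to zero
    return pruned
```

After a step like this, the abstract's procedure would fine-tune the pruned model for a few epochs on clean data to recover the accuracy lost to pruning.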
PUBLICATION RECORD
- Publication year
2023
- Venue
International Conference on Contemporary Computing
- Publication date
2023-08-03
- Fields of study
Computer Science
- Source metadata
Semantic Scholar