Circa: Stochastic ReLUs for Private Deep Learning

Zahra Ghodsi, Nandan Kumar Jha, Brandon Reagen, S. Garg

Published 2021 in Neural Information Processing Systems

ABSTRACT

The simultaneous rise of machine learning as a service and concerns over user privacy have increasingly motivated the need for private inference (PI). While recent work demonstrates PI is possible using cryptographic primitives, the computational overheads render it impractical. The community is largely unprepared to address these overheads, as the source of slowdown in PI stems from the ReLU operator whereas optimizations for plaintext inference focus on optimizing FLOPs. In this paper we re-think the ReLU computation and propose optimizations for PI tailored to properties of neural networks. Specifically, we reformulate ReLU as an approximate sign test and introduce a novel truncation method for the sign test that significantly reduces the cost per ReLU. These optimizations result in a specific type of stochastic ReLU. The key observation is that the stochastic fault behavior is well suited for the fault-tolerant properties of neural network inference. Thus, we provide significant savings without impacting accuracy. We collectively call the optimizations Circa and demonstrate improvements of up to 4.7x storage and 3x runtime over baseline implementations; we further show that Circa can be used on top of recent PI optimizations to obtain 1.8x additional speedup.
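As a rough illustration of how truncating a sign test can make ReLU stochastic, the sketch below runs the sign test on additive secret shares whose low-order bits have been dropped. It is a toy model under assumptions that do not come from the abstract: a 32-bit ring, two additive shares, a plaintext sign test, and the hypothetical helper names `share`, `truncated_sign_relu`, and `fault_rate`. It is not the paper's protocol, which performs the test with cryptographic primitives. In this model, faults arise only when dropping the low bits of each share loses a carry, which affects small positive inputs (0 <= x < 2^k) and values at the extreme negative end of the range.

```python
import random

N_BITS = 32            # assumed ring size: values live in Z_{2^32}
MOD = 1 << N_BITS


def share(x, rng):
    """Split a signed integer x into two additive shares modulo 2^N_BITS."""
    a = rng.randrange(MOD)
    b = (x - a) % MOD
    return a, b


def exact_relu(x):
    """Reference ReLU on a plaintext integer."""
    return x if x >= 0 else 0


def truncated_sign_relu(x, k, rng):
    """ReLU whose sign test only sees the top N_BITS - k bits of each share."""
    a, b = share(x, rng)
    bits = N_BITS - k
    # Sum of the truncated shares; a carry out of the dropped bits is lost here.
    t = ((a >> k) + (b >> k)) % (1 << bits)
    # Two's-complement sign test on the shortened value.
    is_nonneg = t < (1 << (bits - 1))
    return x if is_nonneg else 0


def fault_rate(x, k, trials, rng):
    """Fraction of trials in which the truncated ReLU disagrees with the exact one."""
    faults = sum(
        truncated_sign_relu(x, k, rng) != exact_relu(x) for _ in range(trials)
    )
    return faults / trials


if __name__ == "__main__":
    rng = random.Random(0)
    for x in (1, 100, 1000, -1000):
        print(x, fault_rate(x, k=8, trials=2000, rng=rng))
```

With k = 8, inputs of magnitude 256 or more (away from the negative extreme) are always handled exactly in this model, while small positive inputs are sometimes zeroed. That mirrors the abstract's point that the faults are stochastic and of a kind a fault-tolerant network can absorb.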

PUBLICATION RECORD

Publication year
2021
Venue
Neural Information Processing Systems
Publication date
2021-06-15
Fields of study
Computer Science
Identifiers
arXiv 2106.08475
Source metadata
Semantic Scholar

CITED BY

LRD-MPC: Efficient MPC Inference through Low-rank Decomposition
2026, cites this paper
Privacy-Preserving Machine Learning Based on Cryptography: A Survey
2025, cites this paper
Efficient Single-Server Private Inference Outsourcing for Convolutional Neural Networks
2025, cites this paper
ABLE: Optimizing Mixed Arithmetic and Boolean Garbled Circuit
2025, cites this paper
Efficient and performant Transformer private inference with heterogeneous attention mechanisms
2025, cites this paper
DeepShare: Sharing ReLU Across Channels and Layers for Efficient Private Inference
2025, cites this paper
Communication-Efficient Secure Three-Party Neural Network Inference with dropped server
2025, cites this paper
Cuot: Accelerating Oblivious Transfer on Gpus for Privacy-Preserving Computation
2025, cites this paper
Bi-CryptoNets: Leveraging Different-Level Privacy for Encrypted Inference
2024, cites this paper
Secure Multiparty Generative AI
2024, cites this paper
Privacy-preserving inference resistant to model extraction attacks
2024, cites this paper
Fregata: Fast Private Inference With Unified Secure Two-Party Protocols
2024, cites this paper
TruncFormer: Private LLM Inference Using Only Truncations
2024, cites this paper
ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models
2024, cites this paper
Flash: A Hybrid Private Inference Protocol for Deep CNNs with High Accuracy and Low Latency on CPU
2024, cites this paper
PriViT: Vision Transformers for Fast Private Inference
2023, influential citation
Learning to Linearize Deep Neural Networks for Secure and Efficient Private Inference
2023, influential citation
AutoFHE: Automated Adaption of CNNs for Efficient Evaluation over FHE
2023, cites this paper
FastSecNet: An Efficient Cryptographic Framework for Private Neural Network Inference
2023, cites this paper
Securing Neural Networks with Knapsack Optimization
2023, influential citation
DeepReShape: Redesigning Neural Networks for Efficient Private Inference
2023, cites this paper
C2PI: An Efficient Crypto-Clear Two-Party Neural Network Private Inference
2023, cites this paper
Towards Fast and Scalable Private Inference
2023, cites this paper
SoK: Cryptographic Neural-Network Computation
2023, cites this paper
CompactTag: Minimizing Computation Overheads in Actively-Secure MPC for Deep Neural Networks
2023, cites this paper
SAL-ViT: Towards Latency Efficient Private Inference on ViT using Selective Attention Search with a Learnable Softmax Approximation
2023, cites this paper
On the Gini-impurity Preservation For Privacy Random Forests
2023, cites this paper
AESPA: Accuracy Preserving Low-degree Polynomial Activation for Fast Private Inference
2022, influential citation
Selective Network Linearization for Efficient Private Inference
2022, cites this paper
MPC-Pipe: an Efficient Pipeline Scheme for Semi-honest MPC Machine Learning
2022, cites this paper
Iron: Private Inference on Transformers
2022, cites this paper
MPC-Pipe: an Efficient Pipeline Scheme for Secure Multi-party Machine Learning Inference
2022, cites this paper
Characterizing and Optimizing End-to-End Systems for Private Inference
2022, cites this paper
SIMC 2.0: Improved Secure ML Inference Against Malicious Clients
2022, cites this paper
Characterization of MPC-based Private Inference for Transformer-based Models
2022, cites this paper
Tabula: Efficiently Computing Nonlinear Activation Functions for Secure Neural Network Inference
2022, influential citation
Stochastic Perturbations of Tabular Features for Non-Deterministic Inference with Automunge
2022, cites this paper
CryptoNite: Revealing the Pitfalls of End-to-End Private Inference at Scale
2021, cites this paper
Characterizing and Improving MPC-based Private Inference for Transformer-based Models
2021, cites this paper