Scalable high-performance architecture for convolutional ternary neural networks on FPGA
Adrien Prost-Boucle, A. Bourge, F. Pétrot, Hande Alemdar, Nicholas Caldwell, V. Leroy
Published 2017 in International Conference on Field-Programmable Logic and Applications
ABSTRACT
Thanks to their excellent performance on typical artificial intelligence problems, deep neural networks have drawn a lot of interest lately. However, this comes at the cost of large computational needs and high power consumption. Benefiting from high precision at acceptable hardware cost on these difficult problems is a challenge. To address it, we advocate the use of ternary neural networks (TNNs) that, when properly trained, can reach results close to the state of the art obtained with floating-point arithmetic. We present a highly versatile, FPGA-friendly architecture for TNNs in which both the number of bits of the input data and the level of parallelism can be varied at synthesis time, allowing throughput to be traded for hardware resources and power consumption. To demonstrate the efficiency of our proposal, we implement high-complexity convolutional neural networks on the Xilinx Virtex-7 VC709 FPGA board. While reaching better accuracy than comparable designs, we can target either high throughput or low power. We measure a throughput of up to 27,000 fps at ≈7 W, or up to 8.36 TMAC/s at ≈13 W.
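The abstract's key idea is that ternary weights restricted to {-1, 0, +1} let each multiply-accumulate reduce to an add, a subtract, or a skip, which is what makes the architecture cheap in hardware. The sketch below illustrates this arithmetic only; the threshold value and function names are illustrative assumptions, not the paper's actual training or synthesis procedure.

```python
import numpy as np

# Hypothetical threshold-based ternarization (illustrative only; the paper's
# training method is not reproduced here).
def ternarize(w, threshold=0.5):
    """Map real-valued weights to {-1, 0, +1}."""
    t = np.zeros_like(w, dtype=np.int8)
    t[w > threshold] = 1
    t[w < -threshold] = -1
    return t

def ternary_mac(x, w_t):
    """Multiply-accumulate with ternary weights.

    No multiplier is needed: +1 adds the input, -1 subtracts it,
    and 0 skips it entirely.
    """
    return int(np.sum(x[w_t == 1]) - np.sum(x[w_t == -1]))

w = np.array([0.9, -0.7, 0.1, -0.2, 0.6])
x = np.array([3, 1, 4, 1, 5])
wt = ternarize(w)         # [1, -1, 0, 0, 1]
acc = ternary_mac(x, wt)  # 3 - 1 + 5 = 7
```

In an FPGA implementation, this add/subtract/skip structure is what allows the design to trade parallelism for resources at synthesis time, since each "MAC" is just an adder tree input with a sign or an enable.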
PUBLICATION RECORD
- Publication year
2017
- Venue
International Conference on Field-Programmable Logic and Applications
- Publication date
2017-09-01
- Fields of study
Computer Science, Engineering
- Source metadata
Semantic Scholar
REFERENCES
20 references
CITED BY
71 citing papers