FPGA Implementations of 3D-SIMD Processor Architecture for Deep Neural Networks Using Relative Indexed Compressed Sparse Filter Encoding Format and Stacked Filters Stationary Flow

Yuechao Gao; Nianhong Liu; Sheng Zhang

doi:10.2991/amcce-18.2018.48

Corresponding Author

Yuechao Gao

Available Online May 2018.

DOI: 10.2991/amcce-18.2018.48
Keywords: 3D-SIMD Processor Architecture, Encoding Format, Stationary Flow, Efficiency
Abstract: Deploying computationally and memory-intensive state-of-the-art deep neural networks (DNNs) on embedded systems with limited hardware resources and power budgets is a challenging task. Recently developed techniques such as Deep Compression make it possible to fit large DNNs, such as AlexNet and VGGNet, entirely in on-chip SRAM. However, sparse networks compressed with existing encoding formats, such as CSR or CSC, complicate computation at runtime because of their irregular memory access patterns. In [1], we introduced a computation dataflow, the stacked filters stationary (SFS) dataflow, and a corresponding data encoding format, the relative indexed compressed sparse filter (CSF) format, to make the best use of data sparsity and to simplify data handling at execution time. In this paper we present FPGA implementations of these methods. We implement several compact streaming fully connected (FC) and convolutional (CONV) neural network processors to demonstrate their efficiency. Compared with the state-of-the-art results [2,3,4], our methods achieve at least a 2× improvement in computation efficiency per PE on most layers. In particular, they achieve an 8× improvement on AlexNet layer CONV4 with 384 filters, and an 11× improvement on VGG16 layer CONV5-3 with 512 filters.
Copyright: © 2018, the Authors. Published by Atlantis Press.
Open Access: This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
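
The abstract contrasts absolute-index formats such as CSR and CSC with a relative indexed encoding. The exact CSF layout is defined in [1] and is not reproduced on this page, so the following is only a minimal sketch of generic relative indexing in the style of Deep Compression, not the authors' format. The function names and the `index_bits` parameter are illustrative assumptions. Each stored entry holds a weight and the number of zeros that precede it, so a consumer can walk the data strictly in order.

```python
# Generic relative-index sparse encoding (illustrative sketch only;
# NOT the CSF format of the paper, whose layout is defined in [1]).

def encode_relative(dense, index_bits=4):
    """Encode a 1-D weight list as parallel lists (values, runs).

    runs[i] is the number of zeros skipped before values[i].
    When a zero run would exceed the largest value representable in
    index_bits, an explicit zero weight is stored as padding.
    Trailing zeros are dropped; the decoder needs the original length.
    """
    max_run = (1 << index_bits) - 1
    values, runs = [], []
    run = 0
    for w in dense:
        if w == 0:
            if run == max_run:
                # Run counter is saturated: emit a padding zero entry.
                values.append(0)
                runs.append(max_run)
                run = 0
            else:
                run += 1
        else:
            values.append(w)
            runs.append(run)
            run = 0
    return values, runs


def decode_relative(values, runs, length):
    """Rebuild the dense list from (values, runs)."""
    dense = [0] * length
    pos = -1
    for v, r in zip(values, runs):
        pos += r + 1  # skip r zeros, then land on the stored entry
        dense[pos] = v
    return dense


def sparse_dot(values, runs, activations):
    """Dot product that reads the encoded weights sequentially."""
    acc = 0
    pos = -1
    for v, r in zip(values, runs):
        pos += r + 1
        acc += v * activations[pos]
    return acc


weights = [0, 0, 3, 0, 0, 0, 0, 5, 0]
vals, gaps = encode_relative(weights, index_bits=2)
restored = decode_relative(vals, gaps, len(weights))
result = sparse_dot(vals, gaps, [1, 2, 3, 4, 5, 6, 7, 8, 9])
```

With `index_bits=2` the run of four zeros between the two nonzero weights cannot be stored in one counter, so one padding entry is emitted. The position of each weight is recovered by a running sum, which avoids the pointer lookups that absolute-index formats require.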

Volume Title: Proceedings of the 2018 3rd International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2018)
Series: Advances in Engineering Research
Publication Date: May 2018
ISBN: 978-94-6252-508-5
ISSN: 2352-5401

Cite this article

TY  - CONF
AU  - Yuechao Gao
AU  - Nianhong Liu
AU  - Sheng Zhang
PY  - 2018/05
DA  - 2018/05
TI  - FPGA Implementations of 3D-SIMD Processor Architecture for Deep Neural Networks Using Relative Indexed Compressed Sparse Filter Encoding Format and Stacked Filters Stationary Flow
BT  - Proceedings of the 2018 3rd International Conference on Automation, Mechanical Control and Computational Engineering (AMCCE 2018)
PB  - Atlantis Press
SP  - 275
EP  - 282
SN  - 2352-5401
UR  - https://doi.org/10.2991/amcce-18.2018.48
DO  - 10.2991/amcce-18.2018.48
ID  - Gao2018/05
ER  -
