GUO Qian, HE Guang-hui. Research on design space exploration of FPGA-based convolutional neural network hardware accelerator[J]. Microelectronics & Computer, 2020, 37(8): 66-71.

Research on design space exploration of FPGA-based convolutional neural network hardware accelerator

Abstract: To address resource allocation in FPGA-based convolutional neural network hardware accelerators, a design space exploration method based on a fine-grained pipeline architecture is proposed. To improve throughput, the method relies on three techniques: 1) multi-stage allocation of DSPs to balance the pipeline stages; 2) an adjustable intermediate-result buffer to coordinate BRAM and DDR bandwidth resources; 3) replacing some convolutional layers with depthwise separable convolutions to reduce the network's overall computation. To validate the method, the YOLO2-tiny network is implemented on a ZC-706 FPGA. The results show that, compared with similar designs, this design achieves higher throughput and energy efficiency and lower overall latency.
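As a rough illustration of techniques 1) and 3), the Python sketch below counts multiply-accumulate operations (MACs) for a standard convolution and for its depthwise separable replacement, then distributes a DSP budget across pipeline stages so that the slowest stage is as fast as possible. The abstract does not describe the authors' actual allocation procedure; the greedy rule, the function names, and the assumption that one DSP performs one MAC per cycle are all hypothetical and chosen only for illustration.

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution with an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k      # one k x k filter per input channel
    pointwise = h * w * c_in * c_out      # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


def allocate_dsps(layer_macs, total_dsps):
    """Greedy sketch (not the paper's algorithm): give each stage one DSP,
    then hand every remaining DSP to the stage that is currently slowest.

    Assumes one DSP performs one MAC per cycle, so a stage needs roughly
    macs / dsps cycles per frame.
    """
    if total_dsps < len(layer_macs):
        raise ValueError("need at least one DSP per pipeline stage")
    alloc = [1] * len(layer_macs)
    for _ in range(total_dsps - len(layer_macs)):
        slowest = max(range(len(layer_macs)),
                      key=lambda i: layer_macs[i] / alloc[i])
        alloc[slowest] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical layer shapes, not taken from the paper.
    standard = conv_macs(13, 13, 512, 1024, 3)
    separable = depthwise_separable_macs(13, 13, 512, 1024, 3)
    print("MAC reduction factor:", standard / separable)

    stages = [conv_macs(52, 52, 32, 64, 3), conv_macs(26, 26, 64, 128, 3), separable]
    dsps = allocate_dsps(stages, 900)
    print("DSPs per stage:", dsps)
    print("cycles per stage:", [m / d for m, d in zip(stages, dsps)])
```

For a k x k kernel with c_out output channels, the separable form needs a fraction 1/c_out + 1/k^2 of the original MACs, which is why swapping it into wide layers reduces total computation. The greedy allocation minimizes the largest per-stage cycle count under the stated one-MAC-per-DSP assumption; a real design would also have to respect BRAM and DDR bandwidth limits, which this sketch ignores.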

     
