LI Cen, HE Guang-hui. Design of FPGA-based neural network accelerator for real-time objective detection[J]. Microelectronics & Computer, 2020, 37(7): 6-11.

Design of FPGA-based neural network accelerator for real-time objective detection

  • Implementing object detection algorithms such as YOLO on an FPGA requires optimization at multiple levels, from model quantization down to the hardware architecture. Three techniques are used to reduce hardware latency: (1) bit-width quantization and layer fusion reduce computational complexity; (2) a column-based pipeline architecture with a padding-skip technique shortens the pipeline start-up time; and (3) a design space exploration algorithm balances the pipeline stages and improves DSP efficiency. To demonstrate the proposed accelerator architecture, YOLO with a 1280×384 input is implemented on a ZC706 FPGA, achieving a 1.97× latency reduction or a 1.54× DSP efficiency improvement over traditional accelerators.
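
The abstract does not describe the design space exploration algorithm itself, so the following is only a minimal sketch of the general idea of balancing a layer-wise pipeline under a fixed DSP budget. It assumes one multiply-accumulate (MAC) per DSP per cycle and uses a simple greedy rule (give the next DSP to the slowest stage); the function names, the greedy rule, and the workload numbers are all hypothetical and are not taken from the paper.

```python
import math


def stage_latencies(macs_per_layer, alloc):
    """Cycles per stage, assuming each DSP performs one MAC per cycle."""
    return [math.ceil(m / a) for m, a in zip(macs_per_layer, alloc)]


def balance_pipeline(macs_per_layer, dsp_budget):
    """Greedy allocation (illustrative only, not the paper's algorithm).

    Start with one DSP per pipeline stage, then repeatedly hand one more
    DSP to whichever stage is currently the slowest.
    """
    n = len(macs_per_layer)
    if dsp_budget < n:
        raise ValueError("need at least one DSP per pipeline stage")
    alloc = [1] * n
    for _ in range(dsp_budget - n):
        lat = stage_latencies(macs_per_layer, alloc)
        slowest = max(range(n), key=lambda i: lat[i])
        alloc[slowest] += 1
    return alloc


def dsp_efficiency(macs_per_layer, alloc):
    """Useful MACs divided by (total DSPs x bottleneck stage cycles)."""
    bottleneck = max(stage_latencies(macs_per_layer, alloc))
    return sum(macs_per_layer) / (sum(alloc) * bottleneck)


if __name__ == "__main__":
    macs = [100, 200, 100]  # hypothetical per-layer workloads
    alloc = balance_pipeline(macs, dsp_budget=8)
    print(alloc, stage_latencies(macs, alloc), dsp_efficiency(macs, alloc))
```

In this toy setting a stage with twice the work ends up with twice the DSPs, so no stage idles while waiting for a slower neighbor. A real exploration would also have to respect constraints such as on-chip memory, bandwidth, and legal parallelism factors, which this sketch ignores.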
