XIE Shuai, JIANG Li, YE Yaoyao. Multidimensional parallel FPGA accelerator design for real-time object detection[J]. Microelectronics & Computer, 2021, 38(8): 13-19.

Multidimensional parallel FPGA accelerator design for real-time object detection

  • The YOLOv3-tiny network performs well in both accuracy and real-time performance for object detection. However, its complex network structure means that practical applications require targeted optimization at both the software and hardware levels. To meet real-time requirements, three optimization techniques are combined. At the software level, the amount of computation is reduced by fusing the batch normalization layers, and low-bit-width data is used to increase resource utilization. Multi-dimensional parallel FPGA computation cores are designed to match multiple convolutional layers and improve overall throughput. Fine-grained inter-layer pipelining and a ping-pong buffer design reduce data transfer time. On a ZCU104 FPGA, the design achieves a detection latency of 21 ms for 418 x 418 images, which outperforms similar accelerator designs and improves DSP efficiency by 2.86 times or 8.81 times.
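
The batch normalization fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard inference-time folding, in which the per-channel scale and shift of a batch normalization layer are absorbed into the weights and bias of the preceding convolution; the abstract does not give the authors' exact procedure, and the function names, the flat per-channel weight layout, and the `eps` value below are hypothetical choices made for illustration.

```python
import math


def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch-norm parameters into conv weights and bias.

    weights: one flat list of kernel weights per output channel
    bias, gamma, beta, mean, var: one value per output channel
    (layout and eps are assumptions for this sketch)
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)           # per-channel BN scale
        folded_w.append([w * scale for w in w_c])
        folded_b.append((b_c - mu) * scale + bt)
    return folded_w, folded_b


def conv_at_one_position(weights, bias, patch):
    """Convolution output for a single flattened input patch."""
    return [sum(w * x for w, x in zip(w_c, patch)) + b_c
            for w_c, b_c in zip(weights, bias)]


def batchnorm(values, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization applied per output channel."""
    return [g * (y - mu) / math.sqrt(v + eps) + bt
            for y, g, bt, mu, v in zip(values, gamma, beta, mean, var)]
```

After folding, a single convolution with the new weights and bias produces the same output as convolution followed by batch normalization, so the separate normalization pass is no longer computed at inference time.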
