Citation: CHENG Yu-qing, HE Zhan-zhuang, MA Zhong, BI Rui-xing, MAO Yuan-hong. Intelligent target detection algorithm for embedded FPGA[J]. Microelectronics & Computer, 2021, 38(6): 87-92.

Intelligent target detection algorithm for embedded FPGA

Abstract: As recognition accuracy and real-time performance improve, the computational complexity and memory requirements of convolutional neural network (CNN) target detection algorithms increase sharply, making them difficult to deploy on small, low-power embedded platforms. Based on an analysis of existing target detection network structures, and exploiting the high real-time performance, low power consumption, and parallel processing of FPGAs, this paper proposes a neural network model normalization method for high-speed computation on FPGAs. Guided by this method, an improved target detection network structure is designed; the changes include removing the LRN layer, fusing the Scale layer, and replacing Leaky-ReLU with ReLU. Comparative experiments on the VOC2007 dataset verify the effectiveness of the structure: on a PC, it runs 11.5% faster than the traditional YOLO-V1 algorithm. Simulation on the Xilinx ZCU102 development board shows that the improved algorithm reaches 29 FPS (frames per second) with an accuracy of 62.3 mAP.
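The abstract names Scale-layer fusion and the Leaky-ReLU to ReLU replacement but does not say how the fusion is carried out. The following is a minimal sketch of one common way such a fusion can work, assuming the Scale layer is a per-channel affine transform (y = gamma * x + beta) that directly follows a convolution. All function and variable names are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch: fold a per-channel Scale layer into the preceding
# convolution so that inference needs one layer instead of two.
# Assumption: the Scale layer computes y = gamma * x + beta per output channel.

def conv_at_patch(weights, bias, patch):
    """Convolution output at one spatial position: one dot product per output channel."""
    return [sum(w * x for w, x in zip(w_c, patch)) + b_c
            for w_c, b_c in zip(weights, bias)]

def scale_layer(values, gamma, beta):
    """Per-channel affine transform (assumed form of the Scale layer)."""
    return [g * v + b for v, g, b in zip(values, gamma, beta)]

def fuse_scale_into_conv(weights, bias, gamma, beta):
    """Return convolution parameters equivalent to convolution followed by scale."""
    fused_w = [[g * w for w in w_c] for w_c, g in zip(weights, gamma)]
    fused_b = [g * b_c + b for b_c, g, b in zip(bias, gamma, beta)]
    return fused_w, fused_b

def relu(values):
    """Plain ReLU, the activation the abstract says replaces Leaky-ReLU."""
    return [max(0.0, v) for v in values]

# Toy data: 2 output channels, a flattened 3-element input patch.
weights = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]
bias = [0.1, -0.2]
gamma = [2.0, 0.5]
beta = [-1.0, 3.0]
patch = [1.0, 2.0, 3.0]

# Two-layer path: convolution, then Scale, then ReLU.
unfused = relu(scale_layer(conv_at_patch(weights, bias, patch), gamma, beta))

# Fused path: a single convolution with rescaled parameters, then ReLU.
fused_w, fused_b = fuse_scale_into_conv(weights, bias, gamma, beta)
fused = relu(conv_at_patch(fused_w, fused_b, patch))

# The two paths should agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(unfused, fused))
```

Because the fusion only rescales weights and biases offline, the fused network produces the same outputs while skipping one per-element pass at inference time. Whether the paper fuses the Scale layer in exactly this way is not stated in the abstract.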

     
