WANG Yiqiang, TAO Yang. Research on 3D object detection algorithm based on binocular vision[J]. Microelectronics & Computer, 2022, 39(2): 19-25. DOI: 10.19304/J.ISSN1000-7180.2021.0730

Research on 3D object detection algorithm based on binocular vision

  • Abstract: As autonomous driving technology has advanced, 3D object detection has attracted wide attention. Compared with conventional lidar-based and monocular 3D object detection algorithms, detection based on binocular vision is more cost-effective, but its detection accuracy still needs improvement. This paper therefore proposes F R-CNN, a 3D object detection algorithm based on an improved Stereo Region Convolutional Neural Network (Stereo R-CNN). The algorithm adds a frequency channel attention module (Frequency Channel Attention Network, FcaNet) to the feature extraction network of Stereo R-CNN, so that, from the perspective of feature diversity, the model attends to more semantic information related to the target; this mitigates the impact of weight changes in the deep residual network and strengthens the network's feature extraction capability. In addition, a unified dynamic sample weighting strategy is introduced to allocate loss weights among the multiple tasks during training. It attends to the importance of "hard" samples while also accounting for the contribution of "easy" samples, so that more comprehensive key feature information about the target is extracted. Experimental results show that, compared with Stereo R-CNN, the improved F R-CNN algorithm raises the average precision of 3D object localization by 3% and the average precision of 3D object detection by about 2%.
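The frequency channel attention idea named in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it is a simplified, dependency-free illustration that assumes each channel is summarized by one 2D DCT frequency component (rather than by global average pooling, which corresponds to the lowest frequency only) and then gated by a sigmoid. The learned fully connected layers that a real attention module would place before the sigmoid are omitted, and all function names and the per-channel frequency assignment are hypothetical.

```python
import math


def dct_basis(h, w, u, v):
    """2D DCT-II basis (unnormalized) for frequency index (u, v) on an h x w grid."""
    return [
        [
            math.cos(math.pi * u * (i + 0.5) / h)
            * math.cos(math.pi * v * (j + 0.5) / w)
            for j in range(w)
        ]
        for i in range(h)
    ]


def frequency_descriptor(fmap, u, v):
    """Project one channel's h x w feature map onto a single DCT basis function."""
    h, w = len(fmap), len(fmap[0])
    basis = dct_basis(h, w, u, v)
    return sum(fmap[i][j] * basis[i][j] for i in range(h) for j in range(w))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def frequency_channel_attention(features, freqs):
    """Reweight channels by a gate computed from one DCT component per channel.

    features: list of channels, each an h x w list of lists.
    freqs:    one (u, v) frequency index per channel (an illustrative choice).
    Assumption: the learned layers between descriptor and sigmoid are left out.
    """
    gates = []
    for fmap, (u, v) in zip(features, freqs):
        h, w = len(fmap), len(fmap[0])
        descriptor = frequency_descriptor(fmap, u, v) / (h * w)
        gates.append(sigmoid(descriptor))
    reweighted = [
        [[gate * x for x in row] for row in fmap]
        for fmap, gate in zip(features, gates)
    ]
    return reweighted, gates


# Channel 0 uses frequency (0, 0), which reduces to global average pooling.
# Channel 1 has zero mean, so average pooling would see nothing, while the
# (0, 1) component still captures its horizontal variation.
features = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, -1.0], [1.0, -1.0]]]
reweighted, gates = frequency_channel_attention(features, [(0, 0), (0, 1)])
```

With frequency (0, 0) the descriptor is the channel mean, which is why average-pooling-based channel attention can be viewed as a special case; using other frequencies gives each channel descriptor access to information that the mean discards.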

     
