Citation: MA Xuesen, CHU Zhaokun, MA Ji. Engineering vehicle detection in aerial images with recursive feature fusion and parallel scaling[J]. Microelectronics & Computer, 2022, 39(8): 39-46. DOI: 10.19304/J.ISSN1000-7180.2022.0109

Engineering vehicle detection in aerial images with recursive feature fusion and parallel scaling

    Abstract: To address the poor detection accuracy caused by complex, changing backgrounds, small objects, and large scale variation in UAV aerial images of power transmission corridors, this paper proposes a RetinaNet-based engineering vehicle detection method that uses recursive feature fusion and parallel scaling. The method is better suited to detecting engineering vehicles against complex backgrounds. First, the C2 layer is added as a base layer and used together with the original backbone output layers to generate the feature pyramid, so that small-object features are not heavily compressed. Second, the hierarchy of the original feature pyramid is adjusted: a recursive structure with feedback connections is used for feature extraction to strengthen representation ability, and a novel, lightweight feature fusion strategy is designed to reconstruct the feature pyramid, making full use of contextual information and improving detection of objects in complex backgrounds. Finally, a feature scaling branch with parallel outputs is built on the C5 layer of the backbone from multiple deconvolution blocks and average pooling layers, further increasing feature map resolution and improving detection accuracy for small objects. Comparative experiments are carried out on the engineering vehicle dataset APEV constructed in this paper and on the public PASCAL VOC dataset. The results show that, with a detection speed that still meets engineering application requirements, the proposed method improves detection accuracy over the original RetinaNet by 4.9% on APEV and 2.7% on PASCAL VOC, and that it achieves higher accuracy than Faster R-CNN, SSD, YOLOv3, YOLOv5, LSN, S-RetinaNet, and other methods.
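
The resolution argument in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' implementation. It assumes a ResNet-style backbone in which stage C_k has stride 2^k, and it assumes that each deconvolution block doubles and each average pooling layer halves the side length of the feature map. The function names and the defaults `n_up` and `n_down` are hypothetical, since the abstract only says "multiple" blocks.

```python
# Illustrative arithmetic only; not the authors' implementation.
# Assumption: ResNet-style backbone where stage C_k has stride 2**k.

def feature_map_size(input_size: int, stride: int) -> int:
    """Side length of the feature map for a square input (ceiling division)."""
    return -(-input_size // stride)

def cells_covered(object_size: int, stride: int) -> float:
    """Approximate number of feature cells along one side of an object."""
    return object_size / stride

def pyramid_sizes(input_size: int, levels=(2, 3, 4, 5)) -> dict:
    """Feature map side length per backbone stage, with C2 included."""
    return {f"C{k}": feature_map_size(input_size, 2 ** k) for k in levels}

def scaling_branch_sizes(c5_size: int, n_up: int = 2, n_down: int = 2) -> list:
    """Hypothetical parallel outputs built from C5.

    Assumes each stride-2 deconvolution doubles the side length and each
    stride-2 average pooling halves it; n_up and n_down are made-up defaults.
    """
    ups = [c5_size * 2 ** i for i in range(n_up, 0, -1)]
    downs = [max(1, c5_size // 2 ** i) for i in range(1, n_down + 1)]
    return ups + [c5_size] + downs

if __name__ == "__main__":
    print(pyramid_sizes(512))
    print(cells_covered(16, 4), cells_covered(16, 8))
    print(scaling_branch_sizes(16))
```

Under these assumptions, a 16-pixel object spans about 4 cells per side on C2 but only 2 on C3, the finest level the original RetinaNet pyramid uses. This is one way to read the claim that adding C2 keeps small-object features from being heavily compressed.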

     
