Abstract:
Unmanned aerial vehicles (UAVs) are widely used in fields such as environmental monitoring, resource planning, and power patrol inspection. Their aerial images contain a large number of small targets, which makes target detection difficult. To address this, HFTT-Net, a target detection network based on YOLOv5s that combines high-performance feature extraction with task decoupling, is proposed. First, because the features of small and medium-sized targets in aerial images are hard to extract, a multi-head self-attention mechanism is added to the original backbone network so that the network attends more closely to small-target information, and an SPD (Space-to-Depth) component is used during multi-scale feature fusion to enhance the features of the targets to be detected. Second, to address the common problem of conflict between the classification and regression tasks in target detection, the classification head and the regression head are decoupled, which further improves detection accuracy. Finally, the network is supervised with an EIoU-based regression loss, which speeds up convergence and enables accurate detection of targets in UAV aerial images. Experimental results on the VisDrone2019 dataset show that the high-performance feature extraction and task decoupling in HFTT-Net effectively improve the network's small-target detection ability, and that the network performs particularly well on aerial-image scenes containing many densely packed small targets. Compared with the classic YOLOv5s algorithm, the proposed algorithm improves detection accuracy by 2.5% while still meeting real-time detection requirements.
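
The abstract does not spell out the EIoU-based regression loss, so the following is only a minimal sketch of the commonly published EIoU formulation for a single pair of boxes, not the authors' implementation. It assumes boxes in corner format `(x1, y1, x2, y2)`; the function name `eiou_loss` and the `eps` parameter are hypothetical, and any weighting or focal variant used in HFTT-Net is not reproduced here.

```python
def eiou_loss(pred, target, eps=1e-9):
    """Sketch of an EIoU loss for two boxes given as (x1, y1, x2, y2).

    Loss = 1 - IoU + centre-distance term + width term + height term,
    where each penalty is normalised by the smallest enclosing box.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centres, normalised by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    centre_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalised by the enclosing
    # box's squared width and height respectively.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + centre_term + width_term + height_term
```

In this formulation the width and height terms penalise size mismatch directly rather than through an aspect-ratio term, which is the usual explanation for faster convergence of the box regression.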