Abstract:
With the development of deep learning, text detection in natural scenes has made considerable progress, but detection of multi-directional and curved Chinese text remains unsatisfactory. This paper proposes a multi-scale text detection method that integrates an attention mechanism for detecting multi-directional and curved Chinese text. To balance accuracy against computational cost, a lightweight ResNet18 backbone network is adopted. To address the uncertain distribution of the features extracted by the feature pyramid network (FPN), a balanced attention mechanism (BAM) is embedded to extract effective text features and suppress inefficient feature channels, thereby improving the robustness of the detection method. To address the loss of local and detail information in the image during downsampling in atrous spatial pyramid pooling (ASPP), ASPP is improved to reduce the loss of feature map resolution. To address the insufficient feature extraction and small receptive field of FPN, the attention-embedded FPN and the improved ASPP are fused in parallel to enhance feature extraction. To address the imbalance between positive and negative samples, the logarithmic AC Loss is introduced into the binary map loss of the differentiable binarization module, thereby enhancing the generalization ability of the detection model. Experimental results on the public dataset MSRA-TD500 show that, compared with the fast and efficient DBNet, the precision, recall, and F value of the proposed algorithm increase by 0.1%, 1.4%, and 0.6%, respectively, and its detection speed also remains good.
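For readers unfamiliar with the F value reported above, the following minimal sketch shows how it is conventionally computed as the harmonic mean of precision and recall. It assumes the standard F1 definition; the function name `f_measure` and the input values are hypothetical and are not results from this work.

```python
def f_measure(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing was detected correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values for illustration only; not figures from the paper.
print(round(f_measure(0.90, 0.80), 4))  # prints 0.8471
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high F value by improving precision or recall alone.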