CHEN Xin. Curved text detection method for natural scenes based on attention mechanism and dilated convolution[J]. Microelectronics & Computer,2023,40(8):10-18. doi: 10.19304/J.ISSN1000-7180.2022.0498
CHEN X. Resnet squeeze and excitation dilation jaccard progressive scale expansion network[J]. Microelectronics & Computer,2023,40(8):10-18. doi: 10.19304/J.ISSN1000-7180.2022.0498

Curved text detection method for natural scenes based on attention mechanism and dilated convolution

Resnet squeeze and excitation dilation jaccard progressive scale expansion network

  • Abstract: Curved text detection in natural scenes is widely used in smart tourism applications. Current curved text detectors are limited by the receptive field size and feature extraction capability of convolutional neural networks, which makes it hard for the network to distinguish text regions from non-text regions in natural scene images. To address this, a text detection method for natural scenes based on an attention mechanism and dilated convolution is proposed: the Resnet Squeeze and Excitation Dilation Jaccard Progressive Scale Expansion Network (RSDJ-PSE). RSDJ-PSE first introduces the SE block, a soft attention mechanism, into the backbone of the detection network, which further strengthens feature extraction. It then introduces dilated convolution into the backbone, which enlarges the receptive field of the convolutions without increasing the number of parameters. Finally, it replaces the Dice coefficient with the Jaccard coefficient in the post-processing algorithm, which improves the F-measure of the text detection method. Detection results on the oriented text dataset ICDAR2015, the standard curved text dataset CTW1500, and the Total-Text dataset show that, in a comparison with 8 detection methods, the proposed method achieves the best text detection performance.
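Two quantities named in the abstract can be illustrated with a short sketch. It only shows the textbook definitions of the Dice and Jaccard coefficients on pixel sets, and the standard relation between kernel size, dilation rate, receptive field, and weight count for a single convolution. It is not the RSDJ-PSE implementation; the function names and the toy regions are assumptions made for illustration, and the abstract does not say exactly how the Jaccard coefficient enters the post-processing step.

```python
# Illustrative sketch only; all names here are hypothetical, not from the paper.

def dice(a, b):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two sets of pixel coordinates."""
    if not a and not b:
        return 1.0  # convention: two empty regions are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard coefficient |A n B| / |A u B| for two sets of pixel coordinates."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def effective_kernel_size(k, dilation):
    """Side length of the area covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def conv_weight_count(k, in_ch, out_ch):
    """Weights in a k x k convolution (bias ignored); independent of the dilation rate."""
    return k * k * in_ch * out_ch


# Toy example: two 4 x 6 pixel regions that overlap on 4 x 4 pixels.
pred = {(r, c) for r in range(4) for c in range(0, 6)}
truth = {(r, c) for r in range(4) for c in range(2, 8)}

d = dice(pred, truth)     # 2 * 16 / (24 + 24)
j = jaccard(pred, truth)  # 16 / 32

# A dilated 3 x 3 kernel covers a larger area with the same number of weights.
plain_cover = effective_kernel_size(3, dilation=1)
dilated_cover = effective_kernel_size(3, dilation=2)
weights = conv_weight_count(3, in_ch=64, out_ch=64)
```

For the same pair of regions the Jaccard coefficient is never larger than the Dice coefficient (J = D / (2 - D)), so it penalizes partial overlap more strongly. The dilation rate changes the area a kernel covers but does not appear in the weight count, which is the property the abstract refers to when it says the receptive field grows without adding parameters.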

     
