Multi-label image classification algorithm based on multi-head class-specific residual attention and graph convolution

龚亮威; 宣士斌; 李培杰; 李然

doi:10.19304/J.ISSN1000-7180.2022.0576

Abstract

In ML-GCN, the image features obtained by global max pooling are not targeted at the image regions occupied by specific categories, and they lose local feature information. To address this, a simple and efficient class-specific residual attention (CSRA) module is proposed that effectively captures the different spatial regions occupied by objects of different classes. The module is then combined with a graph convolutional neural network to form a multi-label image classification algorithm (ML-CSRA) based on multi-head class-specific residual attention and graph convolution. First, a convolutional neural network extracts a general image feature map. The class-specific residual attention is then extended to a multi-head form and applied to this feature map to extract, for each region, the features corresponding to the different categories. Finally, the label-correlation features extracted by the graph convolutional neural network are combined with the image features extracted by the multi-head class-specific residual attention to produce the final multi-label classification result. Experiments on the MS-COCO 2014 and VOC-2007 datasets show that the proposed algorithm outperforms existing algorithms on all evaluation metrics.
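The abstract gives no formulas, so the following is only a rough NumPy sketch of how such a pipeline might be wired together, not the authors' implementation. It assumes the commonly published form of class-specific residual attention (an average-pooled logit plus a weight `lam` times an attention-pooled logit, with one softmax temperature per head) and a generic graph convolution layer with symmetric adjacency normalization. The fusion step, in which the GCN output is used as the per-class classifier vectors, is an assumption, and every name, shape, and hyperparameter below is hypothetical.

```python
import numpy as np

def softmax(z, axis):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def csra_logits(feat, classifiers, temperature, lam):
    # feat: (P, d) feature map flattened over P spatial positions.
    # classifiers: (C, d), one vector per class.
    scores = feat @ classifiers.T                 # (P, C) per-position class scores
    attn = softmax(temperature * scores, axis=0)  # attention over positions, per class
    base = scores.mean(axis=0)                    # global average pooling logit
    att = (attn * scores).sum(axis=0)             # class-specific attention logit
    return base + lam * att                       # residual combination

def multi_head_csra(feat, classifiers, temperatures, lam):
    # One head per temperature; head logits are summed (assumed fusion rule).
    return sum(csra_logits(feat, classifiers, t, lam) for t in temperatures)

def gcn_layer(adj, h, w):
    # Generic graph convolution: D^{-1/2} (A + I) D^{-1/2} H W.
    a = adj + np.eye(adj.shape[0])
    deg = a.sum(axis=1)
    a_norm = a / np.sqrt(np.outer(deg, deg))
    return a_norm @ h @ w

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

rng = np.random.default_rng(0)
num_classes, channels, positions = 4, 8, 9        # toy sizes, e.g. a 3x3 feature map

# Hypothetical label graph (symmetric co-occurrence) and label embeddings.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 0],
                [1, 0, 0, 0]], dtype=float)
label_emb = rng.normal(size=(num_classes, 5))
w1 = 0.1 * rng.normal(size=(5, 6))
w2 = 0.1 * rng.normal(size=(6, channels))

# Two GCN layers map label embeddings to per-class classifier vectors.
classifiers = gcn_layer(adj, leaky_relu(gcn_layer(adj, label_emb, w1)), w2)

# Stand-in for the CNN feature map of one image.
feat = rng.normal(size=(positions, channels))

logits = multi_head_csra(feat, classifiers, temperatures=(1.0, 2.0, 4.0), lam=0.5)
probs = 1.0 / (1.0 + np.exp(-logits))             # independent per-label probabilities
print(probs)
```

With `lam = 0` each head reduces to plain average pooling, and a very large temperature makes the attention term approach per-class max pooling over positions, which is the behaviour the residual attention is meant to interpolate between.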
