Multi-label text classification fused with attention mechanism

刘杰; 唐宏; 杨浩澜; 甘陈敏; 彭金枝

doi:10.19304/J.ISSN1000-7180.2022.0416

Multi-label text classification method fused with attention mechanism

Abstract

Abstract: The results of multi-label text classification depend heavily on label correlation. To handle label correlation at a finer granularity, a multi-label text classification method fused with attention mechanisms is proposed. First, after the text and labels are preprocessed, two different embedding methods are applied to the label input to extract features. Second, attention mechanisms process the text and label information: a self-attention mechanism handles feature processing, while a label attention mechanism and an interactive attention mechanism handle dependency relations, yielding representations in two different states. Finally, two rounds of fusion produce a full representation of the text and label information and give better label classification results. Experimental results show that, compared with the baseline methods, the method's results improve overall on precision and normalized discounted cumulative gain (nDCG). The method can therefore effectively fuse text and label information, alleviate the label correlation problem, and help improve performance on multi-label text classification tasks.
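
The abstract names precision and normalized discounted cumulative gain as the evaluation metrics but gives no cutoffs or formulas. The sketch below assumes the ranking-style variants commonly used for multi-label classification, precision@k and nDCG@k with binary relevance. The function names, the toy scores, and the choice of k are illustrative and are not taken from the paper.

```python
import math


def top_k_labels(scores, k):
    """Return the indices of the k highest-scored labels, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def precision_at_k(scores, relevant, k):
    """Fraction of the k highest-scored labels that are true labels."""
    hits = sum(1 for i in top_k_labels(scores, k) if i in relevant)
    return hits / k


def ndcg_at_k(scores, relevant, k):
    """nDCG@k with binary relevance: DCG of the predicted ranking
    divided by the DCG of an ideal ranking (all true labels first)."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so the discount is log2(rank + 2)
        for rank, i in enumerate(top_k_labels(scores, k))
        if i in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example (hypothetical data): five candidate labels, two of them true.
scores = [0.9, 0.1, 0.8, 0.3, 0.7]  # assumed model scores, one per label
relevant = {0, 4}                   # assumed ground-truth label indices

p3 = precision_at_k(scores, relevant, 3)
n3 = ndcg_at_k(scores, relevant, 3)
print(f"P@3 = {p3:.4f}, nDCG@3 = {n3:.4f}")
```

Precision@k counts only how many of the top k predictions are correct, whereas nDCG@k also rewards placing correct labels nearer the top, so the two metrics can diverge when a model finds the right labels but ranks them poorly.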
