ZHOU Mo, SONG Yurong, SONG Bo, SU Xiaoping. Text categorization on D-BGRU with self-attention mechanism[J]. Microelectronics & Computer, 2021, 38(12): 8-16. DOI: 10.19304/J.ISSN1000-7180.2021.0540

Text categorization on D-BGRU with self-attention mechanism

  • Abstract: Traditional recurrent neural networks (RNNs) place a heavy burden on sequence modeling and tend to overlook local detail features, while convolutional neural networks (CNNs) cannot capture long-distance dependencies. To address these problems, a text classification model based on disconnected information flow is proposed. The method introduces the idea of disconnected information flow into the bidirectional gated recurrent unit (BGRU), so that the model extracts long-distance contextual dependencies while retaining a position invariance similar to that of a convolution kernel, and thus accounts for both the temporal and the spatial features of text. On this basis, a self-attention mechanism is integrated to further learn the dependencies between features. It assigns larger weights to important features to reduce noise and redundancy, which strengthens the model's ability to extract key information and optimizes the text features. In experiments on five real-world datasets, including AGnews, DBPedia, and Yelp P., the method achieves higher accuracy than multiple baseline algorithms, reaching 95.8%, 99.7%, 98.1%, 70.4%, and 77.5%, respectively. These results indicate that the model performs text classification more effectively and has good application prospects.
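
The two ideas named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not give the architecture in detail, so the window size `k`, the zero initial state for every window, the bias-free `GRUCell` with untrained random weights, and the identity-projection dot-product attention are all assumptions chosen for illustration, as are the names `GRUCell`, `disconnected_bgru`, and `self_attention`. The sketch restricts each GRU pass to a short window around every position (one plausible reading of "disconnected information flow") and then reweights the resulting features with self-attention.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class GRUCell:
    """Bias-free GRU cell with random, untrained weights (illustration only)."""

    def __init__(self, in_dim, hid_dim, rng):
        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        # One input matrix W and one recurrent matrix U per gate:
        # update gate z, reset gate r, candidate state n.
        self.W = {g: mat(hid_dim, in_dim) for g in "zrn"}
        self.U = {g: mat(hid_dim, hid_dim) for g in "zrn"}
        self.hid_dim = hid_dim

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Interpolate between the candidate state and the previous state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def disconnected_bgru(xs, fwd, bwd, k):
    """For each position t, restart both GRUs from zero and read at most k tokens."""
    out = []
    for t in range(len(xs)):
        h_f = [0.0] * fwd.hid_dim
        for x in xs[max(0, t - k + 1): t + 1]:   # left window ending at t
            h_f = fwd.step(x, h_f)
        h_b = [0.0] * bwd.hid_dim
        for x in reversed(xs[t: t + k]):          # right window, read back toward t
            h_b = bwd.step(x, h_b)
        out.append(h_f + h_b)                     # concatenate both directions
    return out


def self_attention(hs):
    """Scaled dot-product self-attention with identity query/key/value projections."""
    d = len(hs[0])
    attended, weights = [], []
    for q in hs:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in hs]
        m = max(scores)                           # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        weights.append(w)
        attended.append([sum(wj * hj[i] for wj, hj in zip(w, hs)) for i in range(d)])
    return attended, weights


rng = random.Random(0)
tokens = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]  # 6 toy embeddings
fwd, bwd = GRUCell(4, 3, rng), GRUCell(4, 3, rng)
features = disconnected_bgru(tokens, fwd, bwd, k=3)
attended, weights = self_attention(features)
```

Because every window starts from the same zero state, the same `k` tokens yield the same forward feature wherever they occur in the sequence, which is the convolution-like position invariance the abstract refers to. In a real classifier the attended features would be pooled and passed to a trained output layer, which this sketch omits.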

     
