Citation: LI Chaofan, MA Kai. Text classification model based on multi-channel attention mechanism[J]. Microelectronics & Computer, 2022, 39(4): 33-40. DOI: 10.19304/J.ISSN1000-7180.2021.1184

Text classification model based on multi-channel attention mechanism

Abstract: Convolutional neural networks (CNN) and recurrent neural networks (RNN) applied to text classification can lose key feature information because text features are sparse, which degrades model performance and classification quality. To address this, this paper proposes a text classification model based on a multi-channel attention mechanism. First, the text is represented as vectors that fuse character-level and word-level information. Then CNN and BiLSTM channels extract the local features and the contextual information of the text, respectively. Next, an attention mechanism weights the output of each channel to highlight how important each feature word is within its context. Finally, the channel outputs are fused and softmax computes the text category probabilities. In comparative experiments on the dataset, the proposed model classifies better than the models it is compared with; relative to the two single-channel models, its F1 score improves by 1.44% and 1.16%, respectively, which supports its effectiveness for text classification. The two channels offset each other's weaknesses in feature extraction: the model alleviates the loss of word-order information in the CNN and the gradient problem that the BiLSTM faces on text sequences. It integrates local and global text features and highlights key information, yielding a more comprehensive text representation that is well suited to text classification tasks.
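The last three steps of the pipeline (per-channel attention weighting, fusion, and softmax) can be illustrated with a minimal sketch. The abstract does not specify the attention scoring function, the fusion operator, or any dimensions, so everything here is an assumption: scores are dot products with a hypothetical learned context vector, fusion is concatenation, and the names `attention_pool`, `classify`, `ctx_cnn`, `ctx_lstm`, `W`, and `b` are illustrative only. The per-position feature lists stand in for real CNN and BiLSTM outputs, which are not computed here.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_pool(features: List[Vector], context: Vector) -> Vector:
    """Assumed attention form: weight each position by
    softmax(feature . context), then sum the weighted features."""
    weights = softmax([dot(h, context) for h in features])
    dim = len(features[0])
    return [sum(w * h[i] for w, h in zip(weights, features)) for i in range(dim)]


def classify(cnn_feats: List[Vector], lstm_feats: List[Vector],
             ctx_cnn: Vector, ctx_lstm: Vector,
             W: List[Vector], b: Vector) -> Vector:
    """Attend over each channel, fuse by concatenation (an assumption),
    and map the fused vector to class probabilities with softmax."""
    fused = attention_pool(cnn_feats, ctx_cnn) + attention_pool(lstm_feats, ctx_lstm)
    logits = [dot(row, fused) + bias for row, bias in zip(W, b)]
    return softmax(logits)


# Toy stand-ins for channel outputs: one 2-d feature vector per position.
cnn_feats = [[1.0, 0.0], [0.0, 1.0]]
lstm_feats = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]

# Hypothetical parameters for 3 classes over the 4-d fused vector.
W = [[0.2, -0.1, 0.4, 0.0],
     [0.0, 0.3, -0.2, 0.1],
     [-0.3, 0.1, 0.0, 0.5]]
b = [0.0, 0.1, -0.1]

probs = classify(cnn_feats, lstm_feats, [1.0, 0.0], [0.0, 1.0], W, b)
print(probs)
```

Concatenation keeps the two channels' information separate until the classifier; summing or gating the pooled vectors would be equally plausible readings of "fused", and the paper body should be consulted for the actual choice.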

     
