LI Jinbiao, HOU Jin, LI Chen, CHEN Zirui, HE Chuan. Research on text classification method based on BERT-AWC[J]. Microelectronics & Computer, 2022, 39(6): 41-50. DOI: 10.19304/J.ISSN1000-7180.2021.1264

Research on text classification method based on BERT-AWC

  • Existing text classification algorithms suffer from low classification accuracy, very large parameter counts, and difficult training when processing Chinese data; this work therefore optimizes the BERT algorithm. Because BERT cannot extract word-vector features when processing Chinese text, a word-vector convolution module, AWC, is proposed: an attention mechanism is introduced into a traditional convolutional neural network to extract reliable word-vector features and, from them, the local features of the text, compensating for BERT's inability to extract word vectors. BERT's own self-attention network extracts the global features of the text, highlighting the key meaning of the full document. The local features are then introduced into the BERT algorithm, and the local and global features are fused according to their degree of importance, producing richer text information. The fused features are fed into a softmax layer to obtain the model's classification result. A balanced multi-head design, a hierarchical parameter-sharing mechanism, fully connected layer optimization, and related methods greatly reduce the number of model parameters while preserving accuracy, yielding BERT-AWC, a lightweight text classification algorithm based on a hybrid attention mechanism. Experimental results on multiple public datasets show that, compared with the baseline BERT algorithm, BERT-AWC improves prediction accuracy by 1% to 5% while using only 3.6% of BERT's parameters, meeting the design expectations.
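The abstract's pipeline (attention-weighted word vectors, convolutional local features, importance-weighted fusion with a global feature, softmax classification) can be sketched as follows. This is a minimal NumPy illustration of the idea only, not the authors' implementation: the attention scoring, the mean-pooled stand-in for BERT's global feature, the fusion weight `alpha`, and all shapes are assumptions for demonstration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def awc_classify(word_vecs, conv_kernel, w_out, alpha=0.5):
    """Illustrative AWC-style classifier.

    word_vecs:   (seq_len, dim) word vectors for one text
    conv_kernel: (k, dim) convolution filter over k-word windows
    w_out:       (dim, n_classes) softmax-layer weights
    alpha:       hypothetical fusion weight (importance of local features)
    """
    # Attention over word vectors: score each word against the sequence
    # mean, then re-weight the word vectors (the "reliable word-vector
    # features" step of the AWC module, simplified).
    attn = softmax(word_vecs @ word_vecs.mean(axis=0))        # (seq_len,)
    weighted = word_vecs * attn[:, None]

    # 1-D convolution over the weighted sequence, then max-pooling,
    # gives the local feature vector.
    k = conv_kernel.shape[0]
    conv = np.stack([(weighted[i:i + k] * conv_kernel).sum(axis=0)
                     for i in range(len(weighted) - k + 1)])  # (n_windows, dim)
    local = conv.max(axis=0)                                  # (dim,)

    # Global feature: mean pooling here stands in for the output of
    # BERT's self-attention network.
    global_feat = word_vecs.mean(axis=0)                      # (dim,)

    # Importance-weighted fusion of local and global features,
    # then the softmax layer yields class probabilities.
    fused = alpha * local + (1 - alpha) * global_feat
    return softmax(fused @ w_out)                             # (n_classes,)
```

In the actual model the fusion weights would be learned and the global branch would come from the pretrained BERT encoder; the sketch only shows how local and global features combine before classification.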