Chinese Event Extraction Method Based on the ABBSAC Model

陈泉林; 贾珺; 樊硕

doi:10.19304/J.ISSN1000-7180.2023.0292

Abstract

Event extraction is a key component of information extraction and the main way of converting unstructured text into valuable structured information. To address the long training times and large model sizes common to current event extraction models, this paper proposes a Chinese event extraction model based on ABBSAC. The model uses the ALBERT pre-trained model to reduce model size, BiSRU++ to capture associations within the text, and an attention mechanism to improve accuracy; the output of the CRF layer serves as the extraction result. A corpus was constructed independently from Sina News and used for comparative experiments. While achieving high precision, recall, and F1-score, the model trains about 10% faster and has about 82% fewer parameters, demonstrating the advantages of the proposed model. In addition, on the ACE05 and DUEE benchmark datasets, compared with state-of-the-art methods, the F1-score for trigger extraction improves by 1.7% and 0.3%, respectively, and the F1-score for argument role extraction improves by 5.4% and 0.1%, respectively, effectively improving performance on the Chinese event extraction task.
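For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, recall, and F1-score are commonly computed for trigger extraction. It assumes exact-match scoring over (sentence id, start, end, event type) tuples; the tuple layout, the function name `precision_recall_f1`, and the example data are illustrative assumptions, not the paper's evaluation code.

```python
def precision_recall_f1(predicted, gold):
    """Compute micro precision, recall and F1 over two collections of items.

    An item counts as correct only if it appears in both collections
    (exact match). This scoring convention is an assumption.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical triggers: (sentence_id, start, end, event_type)
gold = {(0, 3, 5, "Attack"), (1, 0, 2, "Transport"), (2, 4, 6, "Meet")}
predicted = {(0, 3, 5, "Attack"), (1, 0, 2, "Meet"), (2, 4, 6, "Meet")}

p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In this toy example the second prediction has the right span but the wrong event type, so it counts as an error under exact matching. Argument role extraction could be scored the same way by adding the role label to each tuple.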
