伍洋, 钟鸣, 姜艳, 李石君. 面向审计领域的短文本分类技术研究[J]. 微电子学与计算机, 2015, 32(1): 5-10.
WU Yang, ZHONG Ming, JIANG Yan, LI Shi-jun. Study on Short Text Categorization Technology Oriented Towards Field of Auditing[J]. Microelectronics & Computer, 2015, 32(1): 5-10.

面向审计领域的短文本分类技术研究

Study on Short Text Categorization Technology Oriented Towards Field of Auditing

  • 摘要: 针对审计问题这种短文本所具有的特征稀疏、问题类别界限模糊问题,提出了一种改进的面向审计领域的短文本分类方法.该方法首先为审计问题构造了专门的特征集,以审计领域的同义词词集和法规库为基础,并结合特定规则来调整特征权重,然后以修改的SVM决策树作为多类分类器进行短文本分类.实验结果表明,该方法在对审计问题分类的应用上,具有较为满意的正确率,能满足实际的分类需求.

     

    Abstract: Audit problems are short texts characterized by sparse features and fuzzy boundaries between problem categories. To address these difficulties, an improved short text categorization method oriented towards the field of auditing is proposed. First, a specialized feature set is built for audit problems, and the basic weight calculation is designed on the basis of an audit-domain synonym set, a law library, and designated rules. Next, the feature weights of words that are highly similar to the target words are adjusted. Finally, a modified SVM decision tree is used as the multi-class classifier for short text classification. Experimental results show that the method achieves satisfactory accuracy when categorizing audit problems and can meet practical classification needs.
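
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The synonym set, the law-library terms, the boost factor, the category names, and the linear weights are all hypothetical placeholders, and the linear decision functions only stand in for trained binary SVMs at the internal nodes of the decision tree.

```python
from collections import Counter

# Hypothetical audit-domain synonym set: each variant maps to a canonical term.
SYNONYMS = {"misappropriate": "embezzle", "divert": "embezzle", "fund": "funds"}
# Hypothetical terms that also appear in the law library; the boost is an assumption.
LAW_TERMS = {"embezzle", "funds"}
LAW_BOOST = 2.0


def weighted_features(tokens):
    """Canonicalize tokens via the synonym set, then boost law-library terms."""
    canonical = [SYNONYMS.get(t, t) for t in tokens]
    counts = Counter(canonical)
    total = sum(counts.values())
    return {
        term: (n / total) * (LAW_BOOST if term in LAW_TERMS else 1.0)
        for term, n in counts.items()
    }


def linear(weights, bias):
    """Stand-in for a trained linear SVM: returns True when w.x + b > 0."""
    def decide(features):
        score = sum(weights.get(t, 0.0) * v for t, v in features.items())
        return score + bias > 0
    return decide


# Each internal node is (binary decider, subtree if True, subtree if False);
# each leaf is a category name. All values here are illustrative only.
TREE = (
    linear({"embezzle": 1.0}, -0.5),
    "misuse of funds",
    (linear({"invoice": 1.0}, -0.1), "irregular accounting", "other"),
)


def classify_with_tree(features, node):
    """Walk the tree from the root until a leaf (category name) is reached."""
    while not isinstance(node, str):
        decider, if_true, if_false = node
        node = if_true if decider(features) else if_false
    return node


if __name__ == "__main__":
    feats = weighted_features(["divert", "special", "fund"])
    print(classify_with_tree(feats, TREE))
```

Arranging the binary classifiers in a tree means a text with K candidate categories is evaluated by at most K-1 binary decisions, and categories that are easy to separate can be split off near the root.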

     
