Citation: LIU Yue-feng (刘月峰), YUAN Jiang-hao (苑江浩), ZHANG Xiao-lin (张晓琳). Improved NB Algorithm Research in Spam Filtering Technology[J]. Microelectronics & Computer (微电子学与计算机), 2017, 34(4): 115-120.

Improved NB Algorithm Research in Spam Filtering Technology

  • Abstract: Naive Bayes (NB) is a simple and efficient classification algorithm that is widely used in spam filtering, but its assumption that attributes are independent of one another degrades classification performance to some extent. To address this problem, an improved NB algorithm, FOA-NB, is proposed. FOA-NB combines NB with the fruit fly optimization algorithm (FOA): each feature attribute is assigned a weight according to its influence on classification, and FOA optimizes these weights to obtain a globally optimal feature weight vector. The algorithm retains the simplicity and efficiency of NB while using weight optimization to identify the feature attributes that are most decisive for classification, thereby improving the correct rate and recall rate of spam filtering. Simulation experiments comparing FOA-NB with NB and weighted Bayes (WB) show that FOA-NB clearly improves spam filtering performance, raising both the correct rate and the recall rate by about 5%.
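The idea of weighting NB features and tuning the weights with a swarm search can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact weighting formula or FOA settings, so the sketch assumes a Bernoulli NB model whose per-feature log-likelihoods are scaled by a weight, and a simplified FOA-style search in which flies take random steps around the swarm location and the swarm moves to the best fly (the distance and smell-concentration transform of the original FOA is omitted). All function names, parameters, and the toy data are hypothetical.

```python
import math
import random


def train_nb(docs, labels, vocab):
    """Estimate class priors and Laplace-smoothed Bernoulli likelihoods."""
    prior, cond = {}, {}
    for c in sorted(set(labels)):
        rows = [d for d, y in zip(docs, labels) if y == c]
        prior[c] = len(rows) / len(docs)
        # P(word present | class), smoothed so it is never 0 or 1
        cond[c] = {w: (sum(w in d for d in rows) + 1) / (len(rows) + 2)
                   for w in vocab}
    return prior, cond


def predict(doc, prior, cond, vocab, weights):
    """Weighted NB (assumed form): argmax_c log P(c) + sum_i w_i * log P(x_i|c)."""
    best, best_score = None, -math.inf
    for c in prior:
        score = math.log(prior[c])
        for word, wt in zip(vocab, weights):
            p = cond[c][word] if word in doc else 1.0 - cond[c][word]
            score += wt * math.log(p)
        if score > best_score:
            best, best_score = c, score
    return best


def accuracy(docs, labels, prior, cond, vocab, weights):
    """Fraction of documents whose predicted label matches the true label."""
    hits = sum(predict(d, prior, cond, vocab, weights) == y
               for d, y in zip(docs, labels))
    return hits / len(docs)


def foa_optimize(fitness, dim, flies=20, iters=30, step=0.5, seed=0):
    """Simplified FOA-style search over non-negative weight vectors."""
    rng = random.Random(seed)
    swarm = [1.0] * dim              # start from plain NB (all weights 1)
    best_fit = fitness(swarm)
    for _ in range(iters):
        # Each fly takes a random step away from the swarm location.
        cands = [[max(0.0, w + rng.uniform(-step, step)) for w in swarm]
                 for _ in range(flies)]
        fit, cand = max(((fitness(c), c) for c in cands), key=lambda t: t[0])
        if fit > best_fit:           # swarm moves to the best fly found
            best_fit, swarm = fit, cand
    return swarm, best_fit


# Toy data (hypothetical): each message is a set of words.
docs = [{"free", "win", "money"}, {"win", "prize", "free"},
        {"meeting", "report"}, {"report", "lunch", "free"},
        {"money", "prize"}, {"meeting", "lunch"}]
labels = ["spam", "spam", "ham", "ham", "spam", "ham"]
vocab = sorted(set().union(*docs))

prior, cond = train_nb(docs, labels, vocab)
# Fitness is measured on the training set here only to keep the sketch short.
fitness = lambda w: accuracy(docs, labels, prior, cond, vocab, w)
weights, fit = foa_optimize(fitness, len(vocab))
```

Because the search starts from the all-ones weight vector and only moves when a fly scores higher, the returned fitness is never below that of plain NB on the data used for fitness. In practice the fitness would be computed on a held-out validation set to avoid fitting the weights to the training messages.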

     
