HU Qiang. Feature Selection Based on Two Kinds of Feature Influence Degree[J]. Microelectronics & Computer, 2010, 27(12): 65-68.

Feature Selection Based on Two Kinds of Feature Influence Degree

  • Abstract: Two kinds of feature influence degree are defined. The first measures a feature's influence on the dispersion of documents between categories; the larger it is, the better. The second measures a feature's influence on the dispersion of documents within a category; the smaller it is, the better. The two influence degrees are then combined into a new feature selection method. The method evaluates candidate features comprehensively, so the resulting feature set is more representative. Simulation experiments show that the proposed feature selection method can improve text categorization performance to a certain extent.
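
The abstract does not give the formulas for the two influence degrees, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a Fisher-style ratio: for each feature, the dispersion of class means around the overall mean (between categories, larger is better) is divided by the dispersion of documents around their own class mean (within a category, smaller is better). The names `influence_scores`, `select_top_k`, and the smoothing constant `eps` are hypothetical.

```python
from collections import defaultdict


def influence_scores(docs, labels, eps=1e-9):
    """Score each feature as between-class over within-class dispersion.

    docs:   list of equal-length lists of term weights, one per document.
    labels: list of class labels, one per document.
    NOTE: an assumed Fisher-style ratio, not the formula from the paper.
    """
    n_docs = len(docs)
    n_feats = len(docs[0])

    # Group document rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(docs, labels):
        by_class[label].append(row)

    scores = []
    for j in range(n_feats):
        overall_mean = sum(row[j] for row in docs) / n_docs
        between = 0.0  # dispersion of class means around the overall mean
        within = 0.0   # dispersion of documents around their class mean
        for rows in by_class.values():
            class_mean = sum(row[j] for row in rows) / len(rows)
            between += len(rows) * (class_mean - overall_mean) ** 2
            within += sum((row[j] - class_mean) ** 2 for row in rows)
        # eps (assumed) avoids division by zero for constant features.
        scores.append((between / n_docs) / (within / n_docs + eps))
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]
```

Under this assumed scoring, a term whose weight differs sharply between categories but varies little inside each category ranks highest, which matches the two criteria stated in the abstract. How the paper actually combines the two degrees may differ.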

     
