Citation: CHEN Hao (陈浩). A Discretization Method Based on Statistical Indicator (统计指标离散化方法及应用)[J]. Microelectronics & Computer (微电子学与计算机), 2011, 28(11): 106-109.

A Discretization Method Based on Statistical Indicator

Abstract: Discretizing continuous data can improve the classification ability of data mining algorithms. This paper proposes a discretization method for continuous attributes based on a statistical indicator. It uses a correlation coefficient to measure the interdependence between the class label and each attribute in order to find an optimal list of intervals. It also introduces the variable precision rough set model to control the information loss caused by discretization. The method is applied to sonar sensor data recognition and to other fields. Experimental results show that the method achieves good classification ability with the J48 decision tree.
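The abstract does not give the exact statistic or search procedure, so the following is only a minimal sketch of the general idea of choosing a cut point by its correlation with the class label. It assumes a binary class coded as 0/1 and a single cut point, and it uses the Pearson correlation between a "value above cut" indicator and the label. The function names `pearson` and `best_cut` and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / sqrt(vx * vy)


def best_cut(values, labels):
    """Return (cut, |r|) for the single cut point most correlated with the class.

    Assumption for illustration: candidate cuts are midpoints between
    adjacent distinct attribute values, and labels are coded 0/1.
    """
    distinct = sorted(set(values))
    best = (None, 0.0)
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2
        indicator = [1 if v > cut else 0 for v in values]
        r = abs(pearson(indicator, labels))
        if r > best[1]:
            best = (cut, r)
    return best


# Toy data (invented): low values belong to class 0, high values to class 1.
values = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
cut, strength = best_cut(values, labels)
```

On this toy data the selected cut falls between 0.3 and 0.7, where the indicator matches the labels exactly. A full method would repeat the search to build a list of intervals and, as the abstract notes, would bound the resulting information loss, which this sketch does not attempt.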

     
