CHEN Zhikui, LV Ailing, ZHANG Qingchen. A New Algorithm for Imputing Missing Data Based on Distinguishing the Importance of Attributes[J]. Microelectronics & Computer, 2013, 30(7): 167-172,176.

A New Algorithm for Imputing Missing Data Based on Distinguishing the Importance of Attributes

  • Abstract: Existing algorithms for imputing incomplete data fill every missing value in the same way, without considering how important each value is, which makes them inefficient and poorly suited to real-time use. This paper therefore proposes an imputation algorithm based on attribute importance. An attribute reduction is computed from a difference (discernibility) matrix, and the reduction is used to separate important attributes from unimportant ones. Missing values of important attributes are filled with an improved Mahalanobis-distance method, while missing values of unimportant attributes are filled with a similarity-probability method. This preserves data accuracy while improving real-time performance, which makes the algorithm practical. Finally, experiments on data from a digital home system and on UCI standard datasets analyze the algorithm's performance and verify its superiority.
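
The two-tier idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the details of the improved Mahalanobis method, the similarity-probability method, or the attribute reduction step. The sketch therefore assumes that the set of important columns is already known (`important_cols` is a hypothetical parameter), uses plain nearest-neighbour imputation under Mahalanobis distance for important attributes, and swaps in a simple most-frequent-value fill as a cheap stand-in for the unimportant ones. The function names `mahalanobis_impute`, `mode_impute`, and `impute` are illustrative only.

```python
import numpy as np


def mahalanobis_impute(X, col):
    """Fill NaNs in `col` with the value from the nearest complete row.

    Distance is Mahalanobis distance over the remaining columns.
    Assumption: a row missing `col` has all its other columns observed.
    """
    X = X.copy()
    others = [j for j in range(X.shape[1]) if j != col]
    complete = X[~np.isnan(X).any(axis=1)]
    # Pseudo-inverse guards against a singular covariance matrix.
    cov = np.atleast_2d(np.cov(complete[:, others], rowvar=False))
    inv_cov = np.linalg.pinv(cov)
    for i in np.where(np.isnan(X[:, col]))[0]:
        diff = complete[:, others] - X[i, others]
        # Squared Mahalanobis distance from row i to every complete row.
        d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        X[i, col] = complete[np.argmin(d2), col]
    return X


def mode_impute(X, col):
    """Stand-in for the cheap path: fill NaNs with the most frequent value."""
    X = X.copy()
    missing = np.isnan(X[:, col])
    values, counts = np.unique(X[~missing, col], return_counts=True)
    X[missing, col] = values[np.argmax(counts)]
    return X


def impute(X, important_cols):
    """Use the costly method for important columns, the cheap one otherwise."""
    X = np.asarray(X, dtype=float)
    for col in range(X.shape[1]):
        if not np.isnan(X[:, col]).any():
            continue
        if col in important_cols:
            X = mahalanobis_impute(X, col)
        else:
            X = mode_impute(X, col)
    return X
```

Calling `impute(X, important_cols={1})` on a float array with `NaN` markers sends only column 1 through the more expensive distance-based path. This mirrors the trade-off the abstract describes: accuracy where it matters, speed elsewhere.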

     
