Citation: LIU Bo, LI Yun, ZHANG Xiao-bin, XU Jie. A Parallel Frequent Itemsets Mining Algorithm Based on Binary Coding and Clustering under Cloud Environment[J]. Microelectronics & Computer, 2012, 29(11): 62-65.

A Parallel Frequent Itemsets Mining Algorithm Based on Binary Coding and Clustering under Cloud Environment

Abstract: This paper proposes a parallel frequent itemset mining algorithm based on binary coding for cloud environments. A special binary-coding dependency measure is used to encode the raw dataset and cluster it by dependency. The dataset is then distributed across the cloud environment, and an improved parallel FP-Growth algorithm based on a shared multi-head table mines the frequent itemsets. Experiments show that the algorithm performs well on large-scale datasets.
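
The abstract does not spell out the encoding or the dependency measure, so the following is only a minimal sketch of the general idea of binary-coding transactions, not the authors' method. It assumes each transaction becomes an integer bitmask, uses Jaccard similarity as a stand-in for the paper's dependency measure, and counts support by brute-force enumeration rather than FP-Growth. All function names (`encode`, `support`, `jaccard`, `frequent_itemsets`) are hypothetical.

```python
from itertools import combinations


def encode(transactions):
    """Map each transaction to an integer bitmask; bit i is set when item i is present."""
    items = sorted({item for t in transactions for item in t})
    index = {item: i for i, item in enumerate(items)}
    masks = [sum(1 << index[item] for item in set(t)) for t in transactions]
    return items, masks


def support(mask, masks):
    """Count the transactions whose bitmask contains every bit of `mask`."""
    return sum(1 for m in masks if m & mask == mask)


def jaccard(a, b):
    """Similarity of two bitmasks. Assumption: a stand-in, NOT the paper's dependency measure."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force enumeration up to `max_size` items (FP-Growth avoids this candidate generation)."""
    items, masks = encode(transactions)
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(range(len(items)), size):
            mask = sum(1 << i for i in combo)
            count = support(mask, masks)
            if count >= min_support:
                result[tuple(items[i] for i in combo)] = count
    return result
```

With the bitmask form, checking whether a transaction contains an itemset is a single AND and compare, and similar transactions could be grouped onto the same node before mining. How the paper actually clusters and partitions the data is not stated in the abstract.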

     
