SHI Liang, QIAN Xue-zhong. Research and Implementation of Parallel FP-Growth Algorithm Based on Hadoop[J]. Microelectronics & Computer, 2015, 32(4): 150-154.

Research and Implementation of Parallel FP-Growth Algorithm Based on Hadoop

    Abstract: Building on the PFP (Parallel FP-Growth) algorithm, this paper proposes LBPFP (Load-Balanced Parallel FP-Growth), a load-balanced parallel mining algorithm. LBPFP runs in parallel on the Hadoop framework using MapReduce and applies a load-balancing strategy during data distribution, so that the master node assigns data evenly to the worker nodes. In addition, each worker node applies a pruning strategy while processing its data, which reduces the amount of data it has to handle. The algorithm therefore improves parallel computing capability while also substantially reducing the volume of data processed. Experimental analysis shows that the algorithm performs well on large data sets, which confirms its feasibility.
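The abstract does not spell out the load-balancing strategy, so the following is only a minimal sketch of one common way to spread work evenly, not the authors' method. It assumes each frequent item has a numeric load estimate (here a hypothetical support count) and greedily assigns the heaviest remaining item to the currently lightest group. The function name `balance_items` and the sample loads are illustrative assumptions.

```python
import heapq


def balance_items(item_loads, num_groups):
    """Greedy assignment: heaviest item first, always into the lightest group.

    item_loads: dict mapping item -> estimated load (an assumption, e.g. support count).
    num_groups: number of worker groups to fill.
    Returns a dict mapping group id -> list of items.
    """
    # Heap entries are (total_load, group_id); the lightest group sits on top.
    heap = [(0, g) for g in range(num_groups)]
    heapq.heapify(heap)
    groups = {g: [] for g in range(num_groups)}

    # Sort by descending load; break ties by item name for deterministic output.
    for item, load in sorted(item_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        total, g = heapq.heappop(heap)
        groups[g].append(item)
        heapq.heappush(heap, (total + load, g))
    return groups


# Hypothetical load estimates; these numbers are not taken from the paper.
loads = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 4}
groups = balance_items(loads, 2)
totals = {g: sum(loads[i] for i in items) for g, items in groups.items()}
print(groups)
print(totals)
```

In a MapReduce setting, a grouping like this could be computed once on the master and shipped to the mappers, which then route each transaction's items to the group that owns them. How LBPFP actually estimates load and prunes data on the worker nodes is described in the full paper, not in this abstract.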

     
