Parallel Weighted FIUT Algorithm Based on MapReduce

Abstract: Traditional frequent itemset mining algorithms are inefficient in big data environments. To address this problem, a parallel weighted frequent itemset mining algorithm, PWFIUT (Parallel Weighted Frequent Itemset Ultrametric Tree), is proposed by combining a weighted model with the MapReduce framework. The algorithm counts support by mapping candidate items into the buckets of a hash table, which are divided according to support; this also avoids building conditional patterns and achieves compressed storage. Finally, PWFIUT is tested and analyzed on the Hadoop platform, and the experimental results show that the proposed algorithm has good running efficiency and scalability.