唐德权, 黄金贵. 基于图数据的极大频繁子树挖掘算法研究[J]. 微电子学与计算机, 2020, 37(10): 54-58.
引用本文: 唐德权, 黄金贵. 基于图数据的极大频繁子树挖掘算法研究[J]. 微电子学与计算机, 2020, 37(10): 54-58.
TANG De-quan, HUANG Jin-gui. Algorithm of maximum frequent sub-tree mining based on graph data[J]. Microelectronics & Computer, 2020, 37(10): 54-58.
Citation: TANG De-quan, HUANG Jin-gui. Algorithm of maximum frequent sub-tree mining based on graph data[J]. Microelectronics & Computer, 2020, 37(10): 54-58.

基于图数据的极大频繁子树挖掘算法研究

Algorithm of maximum frequent sub-tree mining based on graph data

  • 摘要: 由于极大频繁子树中已经隐含了所有频繁子树信息,尤其处理大型图数据集时候,挖掘极大频繁子树对提高频繁子树挖掘算法效率具有重要意义.首先在有效编码的基础上提出连接和扩展操作算法,通过两个算法产生所有极大候选子树;其次引入嵌入集计算解决子树同构问题,对子树同构问题进行了优化,进一步提出了一种新的极大频繁子树挖掘算法(MFST);最后证明了算法的正确性和分析了算法在最坏情况下的时间性能,并与其它基于半结构化数据集的频繁子树挖掘算法进行了比较.实验结果表明,MFST算法具有更好的时间性能和空间性能,可以在图数据集中有效挖掘频繁子树.

     

    Abstract: In view of the the current frequent subtree mining problems, the maximal frequent subtree mining algorithm can improve the efficiency of the frequent subtree mining algorithm. Firstly, propose two operations of the join and extension on the basis of effective coding, which are generate all candidate subtrees. Secondly, the embedded set is used to solve the problem of subtree isomorphism, and a new maximum frequent subtree mining algorithm(MFST) is proposed. Finally, the correctness of the algorithm is proved and the time performance of the algorithm in the worst case is analyzed. And compared with other frequent subtree mining algorithms based on semi-structured data sets, prove this algorithm is superior to the current frequent subtree mining algorithms. The experimental results show that the MFST algorithm has better time and spatial performance, and can effectively mining frequent subtree on the graph data sets.

     

/

返回文章
返回