    A Fast Frequent Subtree Mining Algorithm Based on Projected Branches

    Frequent Subtree Mining Based on Projected Branch

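    • Abstract: Frequent subtree mining has high application value in many fields, such as bioinformatics and Web mining. This paper introduces the concept of the projected branch into frequent subtree mining and proposes FTPB, a fast frequent subtree mining algorithm based on projected branches. FTPB makes full use of the characteristics of the tree structure itself: it resolves the tree isomorphism test while computing projected branches, and after scanning the database it can generate new frequent pattern trees directly from the current frequent pattern trees. This reduces both the number of database scans and the search space of candidate patterns, and thereby lowers the complexity of the algorithm. Theoretical analysis and experimental results show that the algorithm is more efficient than other algorithms of the same kind, and that it is effective and feasible.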

       

      Abstract: Discovering frequent subtrees from ordered labeled trees is an important research problem in data mining, with broad applications in bioinformatics, web logs, XML documents, and so on. In this paper, a new concept, the projected branch, is introduced, and a new algorithm, FTPB (frequent subtrees mining based on projected branch), is proposed. The algorithm resolves tree isomorphism while computing projected branches, which reduces its complexity and improves its efficiency. Theoretical analysis and experimental results show that the FTPB algorithm is efficient and effective.
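
      To make the problem setting concrete, the following is a minimal sketch of support counting over a small database of ordered labeled trees. It is not the FTPB algorithm: the abstract does not define the projected branch, so that step is not attempted here. The tree representation, the `$` backtrack marker, the function names, and the restriction to single-edge patterns are all assumptions made for illustration.

```python
from collections import Counter

# Assumed representation: a tree is a (label, children) pair,
# where children is an ordered list of trees.

def encode(tree):
    """Preorder encoding of an ordered labeled tree.

    '$' marks a return to the parent, so two ordered trees are identical
    exactly when their encodings are equal (labels must not contain '$').
    """
    label, children = tree
    parts = [label]
    for child in children:
        parts.extend(encode(child))
        parts.append("$")
    return parts

def edge_patterns(tree):
    """Distinct (parent label, child label) pairs occurring in one tree."""
    label, children = tree
    found = set()
    for child in children:
        found.add((label, child[0]))
        found |= edge_patterns(child)
    return found

def frequent_edges(database, min_support):
    """Single-edge patterns that occur in at least min_support trees."""
    counts = Counter()
    for tree in database:
        # A set is used so each tree counts at most once per pattern.
        counts.update(edge_patterns(tree))
    return {p: c for p, c in counts.items() if c >= min_support}

# Hypothetical three-tree database.
database = [
    ("a", [("b", []), ("c", [("d", [])])]),
    ("a", [("c", [("d", [])]), ("b", [])]),
    ("a", [("b", [("d", [])])]),
]

print(" ".join(encode(database[0])))
print(frequent_edges(database, min_support=2))
```

      The first two trees contain the same edges but differ in sibling order, so their encodings differ even though they share the same frequent single-edge patterns. A full miner would grow larger patterns from such frequent seeds; how FTPB does so is described in the paper itself, not here.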

       
