Abstract:
Discovering frequent subtrees in ordered labeled trees is an important research problem in data mining, with broad applications in areas such as bioinformatics, web logs, and XML documents. In this paper, a new concept, the projected branch, is introduced, and a new algorithm, FTPB (frequent subtrees mining based on projected branch), is proposed. The algorithm performs isomorphism checking while computing projected branches, which reduces its complexity and improves its efficiency. Theoretical analysis and experimental results show that the FTPB algorithm is efficient and effective.