FSMBUS: A Frequent Subgraph Mining Algorithm in Single Large-Scale Graph Using Spark

Yan Yuliang; Dong Yihong; He Xianmang; Wang Wei

doi:10.7544/issn1000-1239.2015.20150256

Yan Yuliang, Dong Yihong, He Xianmang, Wang Wei. FSMBUS: A Frequent Subgraph Mining Algorithm in Single Large-Scale Graph Using Spark. Journal of Computer Research and Development, 2015, 52(8): 1768-1783. DOI: 10.7544/issn1000-1239.2015.20150256


Abstract


Demand for mining frequent subgraphs in a single large-scale graph has grown with the rapid expansion of social networks. However, serial algorithms are inefficient at mining a single large-scale graph under low support thresholds. Meanwhile, few existing distributed algorithms support pattern-growth mining, and the Hadoop framework they run on is poorly suited to iterative computation. This paper proposes FSMBUS, a distributed algorithm for mining frequent subgraphs in a single large-scale graph on the Spark framework. It generates candidate subgraphs in parallel by means of a suboptimal CAM tree and returns all frequent subgraphs for a given user-defined minimum support. In addition, infrequent-pattern testing and search-order selection are introduced to optimize the algorithm, and a Sorted-Greedy method is designed to partition the data and balance the workload. Experiments on real datasets show that FSMBUS is faster and more effective than existing algorithms, and that it can also handle lower support thresholds and larger graph datasets. Moreover, FSMBUS runs 2 to 4 times faster on Spark than on Hadoop.
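The abstract names a Sorted-Greedy method for balancing workload across partitions but does not spell it out. As a minimal sketch, assuming it resembles the classic sort-then-greedy heuristic (sort tasks by estimated cost in descending order, then give each task to the currently lightest partition), the idea might look like the following. The function names, the task labels, and the cost values are hypothetical and are not taken from the paper.

```python
import heapq
from typing import Dict, List


def sorted_greedy_partition(costs: Dict[str, float], num_partitions: int) -> List[List[str]]:
    """Assign tasks to partitions, heaviest first, always to the lightest partition.

    `costs` maps a task label (hypothetical, e.g. a candidate subgraph id)
    to its estimated workload. This is an assumed reading of "Sorted-Greedy",
    not the paper's confirmed procedure.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")

    # Min-heap of (current load, partition index): the lightest partition is on top.
    heap = [(0.0, i) for i in range(num_partitions)]
    heapq.heapify(heap)
    partitions: List[List[str]] = [[] for _ in range(num_partitions)]

    # Sort step: process tasks from most to least expensive.
    for task, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        # Greedy step: place the task on the partition with the smallest load so far.
        load, idx = heapq.heappop(heap)
        partitions[idx].append(task)
        heapq.heappush(heap, (load + cost, idx))

    return partitions


def partition_loads(costs: Dict[str, float], partitions: List[List[str]]) -> List[float]:
    """Total estimated cost assigned to each partition."""
    return [sum(costs[t] for t in part) for part in partitions]


if __name__ == "__main__":
    example_costs = {"a": 7, "b": 5, "c": 4, "d": 3, "e": 1}  # made-up workloads
    parts = sorted_greedy_partition(example_costs, 2)
    print(parts, partition_loads(example_costs, parts))
```

Handling the heaviest tasks first leaves the small ones to even out the remaining imbalance, which is why the sort precedes the greedy assignment in this kind of heuristic.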
